1  Introduction


Databases have proven to be a very useful tool for the storage, retrieval, and manipulation of data
in an organized and systematic fashion. Commercial database systems have matured over the
years and have been successfully utilized in various business and scientific applications, resulting
in a multi-billion dollar industry [Gra95, Yan95]. Although later developments have
an object-oriented flavor to varying degrees, the basic framework of databases was developed
on relational technology [Cod70]. At the heart of this successful paradigm are two simple but
powerful abstractions: storing data in tables, and a non-procedural language to
query such tables.

Additional constraints between data tables, or between attribute values of the same table,
have to be imposed by specifying extra conditions. Functional dependencies, which
are constraints between values of sets of attributes in a data table, are the focus of this paper. A set
of attributes $Y$ is said to be functionally dependent on a set of attributes $X$ (denoted $X \to Y$) if
any two rows that have the same values for the attributes in $X$ also have the same values for the
attributes in $Y$. Data dependencies of various kinds were defined and investigated as a means of
specifying and enforcing known relationships between entities in a database. Relations in which given
types of dependencies hold among entities result in particular normal forms [BBG78], thereby making
cleaner and more modular data tables. The modularity is necessary to maintain proper semantics
during insert, delete and update operations [Var88]. There are algorithms that automatically
produce normalized designs of logical data models from specifications of dependencies that exist
between attributes in a relation [Ull88].

In addition to enforcing semantic constraints, functional dependencies have many other uses,
such as in semantic query optimization [Bel96, Dec87], in data cleansing, where the nature of the
schema can be used to identify invalid entries and correct some erroneous entries, in schema
integration, in database restructuring [CAdS84, MR94, MR92a], and in knowledge categorization [PS92].
The publication [MR94] lists other applications of functional dependencies.

If functional dependencies are known at schema design time, they can be used in the design
process itself. Conversely, over the years a lot of data has been collected without a priori
knowledge about its dependencies, creating the need to mine for functional dependencies from
attribute values in databases. In the process of mining, the search for dependencies
holding in the given state of the database can be enhanced by accounting for logical consequences
of already mined ones, using the well known inference rules for functional dependencies,
commonly referred to as Armstrong’s axioms [Arm74], which we shall refer to as Armstrong’s Rules.

Algorithms that mine for functional dependencies, such as [Bel95a, Jan88], use Armstrong’s rules
in the stated way. In such algorithms, once a functional dependency is known to fail, it is equally
expedient to weed out other potential functional dependencies that would imply the invalid one.
For this explicit purpose, functional independencies were proposed in [Jan88]. Hence, avoiding the
mining of consequences of learned dependencies and independencies is facilitated by finding a set of
rules to infer new dependencies and independencies from already discovered ones, and consequently
a complete axiomatization of functional dependencies and independencies merits interest.

In  this  respect,  Janas  [Jan88]  presented  an  axiomatization  for  both  functional  dependencies  and 
independencies,  which  was  argued  to  be  incomplete  by  Bell  [Bel95a,  Bel95b].  We  find  some  of  the 
arguments  presented  in  these  two  publications  incomplete  and  inaccurate,  and  this  paper  remedies 
those  defects. 

Consequently, this paper provides a syntactic completeness proof of a complete axiomatization
of functional dependencies and independencies. In the process we show that all proofs in our system
have normal forms. The existence of normal forms can be exploited by a proof execution engine in
two different ways. Firstly, the structure of proofs that we look for is restricted. Secondly, we need
not search for any non-normal proofs, and that results in considerable savings in time.

1.1  Independencies  and  Excluded  Dependencies 

Simultaneously with the work of Janas [Jan88], there has been work on excluded functional
dependencies (XFDs) [Tha88, GL90]. Both of these papers use excluded functional dependencies
to refer to functional dependencies that are not valid in a given instance of a database,
but the notions of completeness used in them are remarkably different. In [GL90], a set $A$ of
XFDs is said to be complete if there is a database instance in which $A$ constitutes the set of all
invalid dependencies. Using the closed world assumption, they show how to construct an Armstrong
relation from a complete set of XFDs. In contrast, the notion of completeness given in [Tha88]
is the same as ours, and using the deduction theorem for closed formulae, it shows that an equivalent
system is complete for functional dependencies and independencies.

1.2  Related  Work 

Dependency theory has a long and rich history, as summarized in [FV86, Var88, Kan90].
In addition to developing diverse notions of data dependencies, these works also addressed the
issues of equivalence and relationships between them. In the field of dependency mining there are
fewer works. Although this article does not deal directly with dependency mining, it is the main
beneficiary of our work and hence we summarize some of the related works.

Mining for functional dependencies can be reduced to computing a small cover (a
deductively equivalent set of functional dependencies holding in a database state [Mai83]). The
work reported in [MR87] provides an efficient algorithm to compute a small cover by considering
possible counterexamples for assumed functional dependencies (called disagree sets in [MR87]), and
complexity bounds for finding small covers are given in [MR92b]. [BMT89] shows that for relations
of modest size the algorithms presented in the literature for dependency mining accomplish their task
in reasonable time, thereby showing that tools such as [BMR85] that use such algorithms run with
acceptable performance. [SF96] also addresses the problem of mining for functional dependencies from
relations by constructing positive and negative covers. They maintain a set of possible dependencies
and independencies in the potential positive and negative covers. They express the need to use
inference rules to expedite the process of constructing positive and negative covers, but do not use them.
Work reported in [MR94] presents algorithms to extract functional dependencies from relations
that use optimizations other than inference rules.

1.3  Summary  of  Work 

We show that Janas’ axiomatization [Jan88] is incomplete with respect to functional dependencies
and independencies, and that a variant of Bell’s axiomatization [Bel95b] in conjunction with
Armstrong’s Rules is complete. These new axioms are called the FI Axioms, referring to the
fact that they are axioms for functional independencies. Our approach follows the proof-theoretic
tradition [Tak91] in mathematical logic.

In Section 2, we present the notation used and review appropriate concepts from logic. In
Section 3, we describe various proof-theoretic properties of the FI axiom system to show soundness
and completeness. In Section 4, we show that Janas’ system is incomplete with respect to deriving
functional dependencies and independencies. Departing from standard practice in dependency
theory, in Section 5 we prove that every proof that uses FI axioms can be transformed into a proof
in normal form. In Section 6, we derive the consequences of the normal form theorem to prove
completeness. Some of the more detailed auxiliary results in this section are proved in
the appendix. In Section 7, we show that the FI axioms are complete for functional dependencies
and independencies. In our approach, we develop consistency properties to create models and then
show that a failed attempt to derive a functional independency produces a complete consistency
property. In Section 8, we show the connection between Armstrong relations and our construction
of counter models.

One of the advantages of our approach is that, in addition to giving direct proof-theoretic
justifications of syntactic results, we also state and prove a normalization theorem. The important
property of this normal form is that the applications of independency axioms are limited to three
levels and occur in a specific order. This fact can be utilized when searching for derived independencies,
in that one need only look for proofs that satisfy these conditions. Hence the running time of the
proof-search procedures is reduced significantly.

2  Syntax,  Semantics  and  Proof  Rules 

This  section  contains  basic  terminology  used  to  formulate  and  prove  the  completeness  theorem  for 
functional  independencies. 

2.1  Syntax 

Our  syntax  consists  of  the  following  components: 

1.  $U$ is the set of all attributes.

2.  Subsets of attributes (i.e., subsets of $U$) are denoted by upper case letters (possibly
subscripted). The union of subsets $X$ and $Y$ is denoted by $XY$.

3.  Attribute values are denoted by lower case letters (possibly subscripted) of the corresponding
attribute sets.

4.  Two connectives $\to$ and $\nrightarrow$ denote respectively dependencies and independencies, and the
connective $\subseteq$ denotes the subset relationship between sets of attributes.

5.  Sentences are of the form $(X \to Y)$, $(X \nrightarrow Y)$ and $(X \subseteq Y)$, where $X$ and $Y$ are sets of
attributes as given in 2.

2.2  Semantics 

A model to interpret our syntax consists of a data table that has all elements of $U$ as attributes.
For the purposes of this work, we assume that the database consists of a universal relation (i.e. all
data tables in a database are viewed as one data table). Rows $i$ and $j$ are respectively denoted by
$t_i$ and $t_j$. The values of the attributes in the attribute set $A$ in row $t_j$ are denoted by $t_j[A]$.

Definition 1 (Satisfaction) Let $T$ be a model and $A$, $B$ be sets of attributes. Then:

1.  We say a data table (model) $T$ satisfies the functional dependency $(A \to B)$ (notation:
$T \models (A \to B)$) [EN94] if for all rows $i$ and $j$ of $T$, if $t_i[A] = t_j[A]$, then $t_i[B] = t_j[B]$.

2.  We say a data table (model) $T$ satisfies the functional independency $(A \nrightarrow B)$ (notation:
$T \models (A \nrightarrow B)$) if $T \not\models (A \to B)$, i.e. there are two rows $i$ and $j$ of $T$ with
$t_i[A] = t_j[A]$ and $t_i[B] \neq t_j[B]$.
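
Definition 1 can be checked directly on a small table. The following Python sketch is only an illustration: it assumes rows are represented as dictionaries from attribute names to values, and the function names `satisfies_fd` and `satisfies_fi` as well as the example table are hypothetical, not taken from the paper.

```python
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows agreeing on lhs also agrees on rhs."""
    for r1, r2 in combinations(rows, 2):
        agree_lhs = all(r1[a] == r2[a] for a in lhs)
        agree_rhs = all(r1[b] == r2[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            return False  # this pair of rows is a counterexample
    return True

def satisfies_fi(rows, lhs, rhs):
    """A table satisfies the independency exactly when it violates the dependency."""
    return not satisfies_fd(rows, lhs, rhs)

# Hypothetical example table (not taken from the paper).
table = [
    {"emp": 1, "dept": "D1", "mgr": "Ann"},
    {"emp": 2, "dept": "D1", "mgr": "Ann"},
    {"emp": 3, "dept": "D2", "mgr": "Bob"},
]

print(satisfies_fd(table, ["dept"], ["mgr"]))  # dept -> mgr
print(satisfies_fi(table, ["dept"], ["emp"]))  # dept -/-> emp
```

In this table the first two rows agree on `dept` and `mgr` but differ on `emp`, so the dependency of `mgr` on `dept` holds while `emp` is functionally independent of `dept`.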
2.3  Rules of Inference


Rules of inference popularly known as Armstrong’s axioms [EN94] are used to derive functional
dependencies, as listed below. In keeping with the spirit of this terminology, we also refer to the
other rules of inference as axioms.

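Armstrong’s Rules (Axioms)

Reflexivity:
$$\frac{X \subseteq Y}{Y \to X}\ \text{(FD1)}$$

Augmentation:
$$\frac{W \subseteq V \qquad X \to Y}{XV \to YW}\ \text{(FD2)}$$

Transitivity:
$$\frac{X \to Y \qquad Y \to Z}{X \to Z}\ \text{(FD3)}$$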
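
Derivability under Armstrong’s rules is commonly decided with the textbook attribute-closure procedure rather than by searching for proof trees. The sketch below shows that standard technique, not an algorithm from this paper; the representation of a dependency as a pair of Python sets and the names `closure` and `derives_fd` are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already covered, the right-hand side follows.
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return frozenset(result)

def derives_fd(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds (rhs lies in the closure of lhs)."""
    return set(rhs).issubset(closure(lhs, fds))

# Hypothetical dependency set: A -> B and B -> C.
sigma = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(derives_fd(sigma, {"A"}, {"C"}))  # transitivity: A -> C
print(derives_fd(sigma, {"C"}, {"A"}))  # not derivable
```

The loop terminates because the result set only grows and is bounded by the attributes mentioned in the dependencies.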

Armstrong’s Rules have been shown to be complete for functional dependencies [Ull88, Mai83].
In order to compute the set of valid functional dependencies in a given data table, the concept
of functional independency was proposed. Analogous to Armstrong’s Rules, Janas proposed an
axiomatization [Jan88], as given below.

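Janas’ Rules

$$\frac{X \nrightarrow Y}{X \nrightarrow YZ}\ \text{(J1)} \qquad
\frac{XZ \nrightarrow YZ}{XZ \nrightarrow Y}\ \text{(J2)} \qquad
\frac{X \to Y \qquad X \nrightarrow Z}{Y \nrightarrow Z}\ \text{(J3)}$$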

The  above  axiomatization  was  claimed  to  be  incomplete  by  Bell  [Bel95a,  Bel95b].  However 
no  satisfactory  proof  was  provided.  In  addition,  Bell  proposed  the  following  axiomatization  for 
functional  independencies. 

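Bell’s Rules

$$\frac{W \subseteq V \qquad V \nrightarrow YW}{V \nrightarrow Y}\ \text{(B1)} \qquad
\frac{X \to Y \qquad X \nrightarrow Z}{Y \nrightarrow Z}\ \text{(B2)} \qquad
\frac{Y \to Z \qquad X \nrightarrow Z}{X \nrightarrow Y}\ \text{(B3)}$$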

Following Bell’s work, we propose the following axiomatization, which in the presence of
Armstrong’s rules is equivalent to Bell’s (i.e. has the same set of theorems). The only difference
between our rules and Bell’s is that we have replaced B1 with FI1, where the set
inclusion in the antecedent has been replaced by a dependency. The reason for this change, which
will become clear in Section 5, is that having a dependency instead of a set inclusion lends the proof
method to a more syntactic analysis.

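FI Rules (Rules for Functional Independency Inference)

$$\frac{V \to W \qquad V \nrightarrow YW}{V \nrightarrow Y}\ \text{(FI1)} \qquad
\frac{X \to Y \qquad X \nrightarrow Z}{Y \nrightarrow Z}\ \text{(FI2)} \qquad
\frac{Y \to Z \qquad X \nrightarrow Z}{X \nrightarrow Y}\ \text{(FI3)}$$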
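
As a sanity check (not a proof), the soundness of the three FI rules can be tested by brute force. The sketch below makes two assumptions of its own: it restricts attention to a universe of three attributes, and it considers only two-row tables, each summarized by the set of attributes on which its two rows agree. Two rows suffice here because each rule has a single independency premise, which is witnessed by one pair of rows, and dependencies remain satisfied when a table is restricted to that pair. All names in the code are hypothetical.

```python
from itertools import combinations

ATTRS = ("A", "B", "C")  # small hypothetical universe of attributes

def subsets(attrs):
    """All subsets of attrs as frozensets."""
    return [frozenset(c) for r in range(len(attrs) + 1)
            for c in combinations(attrs, r)]

def fd(agree, lhs, rhs):
    """lhs -> rhs holds in a two-row table whose rows agree exactly on `agree`."""
    return (not lhs.issubset(agree)) or rhs.issubset(agree)

def fi(agree, lhs, rhs):
    """lhs -/-> rhs holds exactly when lhs -> rhs fails."""
    return not fd(agree, lhs, rhs)

def check_rules():
    """Collect counterexamples to FI1, FI2 and FI3 (empty list if none exist)."""
    bad = []
    all_sets = subsets(ATTRS)
    for agree in all_sets:
        for x in all_sets:
            for y in all_sets:
                for z in all_sets:
                    # FI1 with V=x, W=z, Y=y: V -> W, V -/-> YW gives V -/-> Y
                    if fd(agree, x, z) and fi(agree, x, y | z) and not fi(agree, x, y):
                        bad.append(("FI1", agree, x, y, z))
                    # FI2: X -> Y, X -/-> Z gives Y -/-> Z
                    if fd(agree, x, y) and fi(agree, x, z) and not fi(agree, y, z):
                        bad.append(("FI2", agree, x, y, z))
                    # FI3: Y -> Z, X -/-> Z gives X -/-> Y
                    if fd(agree, y, z) and fi(agree, x, z) and not fi(agree, x, y):
                        bad.append(("FI3", agree, x, y, z))
    return bad

print(check_rules())
```

An empty result only says that no counterexample exists over this small universe; the general soundness argument is the one given by the semantics of Definition 1.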

By constructing appropriate consistency properties for proof rules (e.g. [Fit83]), we prove that
the last axiomatization is complete for functional independencies.


2.4  Equivalence  of  Bell’s  and  FI  Systems 

The only difference between the proof rules we use and those proposed by Bell [Bel95a, Bel95b] is that
the antecedent $W \subseteq V$ in B1 has been replaced by $(V \to W)$ in the antecedent of FI1. In this
section we show that B1 and FI1 are equivalent in the presence of Armstrong’s rules. In order to
do so, we prove FI1 using B1 and vice-versa.

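Proving FI1 using B1, B3 and FD2

$$\dfrac{V \subseteq V \qquad
  \dfrac{\dfrac{V \to W}{VY \to WY}\ \text{(FD2)} \qquad V \nrightarrow YW}
        {V \nrightarrow VY}\ \text{(B3)}}
 {V \nrightarrow Y}\ \text{(B1)}$$

Proving B1 using FI1

$$\dfrac{\dfrac{W \subseteq V}{V \to W}\ \text{(FD1)} \qquad V \nrightarrow YW}
        {V \nrightarrow Y}\ \text{(FI1)}$$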

3  Proof-Theoretic  Properties 

We first prove some structural theorems about proofs. In these proofs, we use notation from proof
theory, such as threads in proofs, proof fragments, and equivalence of proof fragments. We
provide the basic definitions here and refer the reader to a standard textbook in proof theory such
as [Tak91] for further details. We do so because our proofs are proof-theoretic in nature, as opposed
to the model-theoretic proofs provided by Bell. Hence we redefine some terminology to better suit our
proofs. Throughout we use $\Sigma$ for a set of dependencies and $\Sigma'$ for a set of independencies.

3.1  Notation  from  Logic 

Definition 2 (Rule) A rule (of inference) is an expression of the form $\frac{S}{T}$ or of the form
$\frac{S_1 \quad S_2}{T}$, where $S$, $S_1$, $S_2$ and $T$ are sentences. In these rules, $S$, $S_1$ and $S_2$
are called antecedents and $T$ is called the consequent.

Definition 3 (Proof) A proof [Tak91] $P$ is a tree of sentences satisfying the condition that every
non-leaf node and its children constitute an instance of an inference rule.

Throughout this paper we use $\Sigma$ and $\Sigma'$ for sets of dependency and independency sentences
respectively.

Following customary nomenclature, the sentences at the leaves of a proof $P$ are called the
assumptions of $P$ and the sentence at the root of $P$ is called the conclusion of $P$. Also,
following our convention, suppose $\Sigma$ and $\Sigma'$ are respectively sets of functional dependencies and
independencies. Then we say that a sentence $\psi$ is a logical consequence of the set of sentences
$\Sigma \cup \Sigma'$ if there is a proof of $\psi$ where the assumptions are taken from the set $\Sigma \cup \Sigma'$
and where the rules of inference are drawn from the FI system. We use the notation
$\Sigma \cup \Sigma' \vdash \psi$ to indicate so. We also write $Cn(\Sigma \cup \Sigma')$ for the set of logical
consequences of $\Sigma \cup \Sigma'$, i.e. $Cn(\Sigma \cup \Sigma') = \{\psi : \Sigma \cup \Sigma' \vdash \psi\}$. Also, a proof that
uses only Armstrong’s rules (i.e. FD1, FD2 and FD3) is called an FD-proof, and one that involves the
FI rules (i.e. FI1, FI2 and FI3) is called an independency proof. We use $\Sigma \vdash_{FD} \psi$ to indicate
that there is a proof of $\psi$ using assumptions from $\Sigma$ with rules of inference drawn from Armstrong’s
system. Similarly, $\Sigma' \vdash_{I} \psi$ indicates that there is a proof of $\psi$ using assumptions from $\Sigma'$
and the independency rules, and $\Sigma \cup \Sigma' \vdash_{Janas} \psi$ indicates that there is a proof of $\psi$ with
assumptions drawn from $\Sigma \cup \Sigma'$ using the rules of Janas’ system.

Definition 4 (Threads in Proofs) A sequence of sentences is called a thread [Tak91] if:

•  It begins with an assumption and ends with the conclusion, and

•  Every sentence in the sequence except the last is an antecedent of an inference rule and is
immediately followed by the consequent of the same inference rule.

Definition  5  (Fragment  of  a  Proof)  A  part  of  a  proof  which  itself  is  a  proof  is  called  a  fragment 
of  a  proof  (sometimes  called  a  subproof  [Tak91]). 

3.2  Domain  Specific  Results 

Notice that our proof system is stated as a natural deduction system [Pra65] with two connectives,
$\to$ and $\nrightarrow$. In this section we prove several proof-theoretic results that reveal the nature of
deductions (i.e. proofs) in our system and eventually lead us to the proof of the completeness
theorem. Some of the results proved have appeared in [Bel95b, Bel95a], but with a very different
flavor of proofs. Some proofs given in [Bel95b] and [Bel95a] are inaccurate or unjustified, or rely on
lemmas that are unproved and non-trivial. Specifically: In Lemma 1 of [Bel95b],
it is claimed that a partially filled table can be completed without affecting $\Sigma$ because a set of
dependencies and independencies $\Sigma \cup \Sigma'$ is consistent. Consistency as defined in this paper says
that there is some data table that satisfies $\Sigma \cup \Sigma'$, and not that a partially filled table can be
completed. Furthermore, Corollary 1 (presumably to Lemma 1) is stated without a proof, and we
do not see how it is a corollary to any lemma proved up to that point. We prove this corollary by
syntactic means. In Lemma 4, where the incompleteness of Janas’ system is shown, at one step it
is claimed that J1 and J2 could not have been applied, and we do not see any trivial justification.
In Theorem 2 (Completeness of Bell’s system) it is not clear that the case analysis is exhaustive.
To avoid such problems, we provide all necessary proofs in complete detail.

We  begin  by  first  showing  that  the  addition  of  independencies  does  not  affect  the  derivable 
dependencies,  in  the  following  lemma. 

Definition 6 (Dependency Property) We say that a proof system $A$ has the Dependency property
if, whenever $\Sigma \cup \Sigma' \vdash_A (X \to Y)$, we also have $\Sigma \vdash_{FD} (X \to Y)$.

Lemma 1 (Dependence Property of FI) If $\Sigma \cup \Sigma' \vdash (X \to Y)$ then $\Sigma \vdash_{FD} (X \to Y)$.

Proof: By induction on the height of the proof tree of $(X \to Y)$.

Suppose that $\Sigma \cup \Sigma' \vdash (X \to Y)$. Then consider the proof $t$ of $(X \to Y)$ from $\Sigma \cup \Sigma'$ with
the minimal height. Notice that only proof rules that have $\to$ as the main connective in the
consequent could have been used as the last step in $t$. Hence, the last rule has to be one of
Armstrong’s rules, FD1, FD2 or FD3.

Case 1. The last rule used to deduce $(X \to Y)$ is either FD1 or FD2.

Then $t$ is of the form:

$$\dfrac{\dfrac{t_1}{P \to Q}}{X \to Y}$$

Thus, $\frac{t_1}{P \to Q}$ is a proof of $(P \to Q)$ from $\Sigma \cup \Sigma'$ with a height smaller than that of $t$.
Hence, by the inductive assumption, there is a proof $t_1'$ of $(P \to Q)$ from $\Sigma$. Hence,
$\frac{t_1'}{X \to Y}$ is a proof of $(X \to Y)$ from $\Sigma$.

Case 2. The last rule used to deduce $(X \to Y)$ is FD3.

Then $t$ is of the form:

$$\dfrac{\dfrac{t_1}{P_1 \to Q_1} \qquad \dfrac{t_2}{P_2 \to Q_2}}{X \to Y}\ \text{(FD3)}$$

Consequently, by an argument similar to Case 1, there are proofs $t_1'$ and $t_2'$ respectively of
$(P_1 \to Q_1)$ and $(P_2 \to Q_2)$ from $\Sigma$. Hence, the following is a proof of $(X \to Y)$ from $\Sigma$:

$$\dfrac{t_1' \qquad t_2'}{X \to Y}\ \text{(FD3)}$$


At the heart of all our arguments is the simple but powerful fact that every proof in this system
has a unique proof thread in which the major connective is $\nrightarrow$.

Definition 7 (Independency Thread and Single Independency Thread Property)

•  A proof thread in which the connective at every step is $\nrightarrow$ is said to be an independency thread.

•  A proof that has a unique independency thread is said to have the single independency thread
property.
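
Proofs, threads and independency threads can be made concrete with a small tree data structure. The sketch below is illustrative only; the `Node` class, its field names and the string encoding of sentences are assumptions, not notation from the paper. The example tree encodes the derivation of B1 from FI1 given in Section 2.4.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One sentence of a proof tree; premises are its children (antecedents)."""
    kind: str   # "fd" for X -> Y, "fi" for X -/-> Y, "sub" for a subset sentence
    text: str
    rule: str = ""  # rule that produced the sentence ("" for an assumption)
    premises: List["Node"] = field(default_factory=list)

def threads(node):
    """All threads: paths from an assumption (leaf) down to the conclusion."""
    if not node.premises:
        return [[node]]
    return [t + [node] for p in node.premises for t in threads(p)]

def independency_threads(root):
    """Threads in which every sentence is an independency."""
    return [t for t in threads(root) if all(n.kind == "fi" for n in t)]

# Proof of B1 from FI1: (W subset V) gives V -> W by FD1, then FI1 is applied.
sub = Node("sub", "W subset V")
dep = Node("fd", "V -> W", "FD1", [sub])
assum = Node("fi", "V -/-> YW")
concl = Node("fi", "V -/-> Y", "FI1", [dep, assum])

for thread in independency_threads(concl):
    print([n.text for n in thread])
```

For this tree there are two threads in total, and only the one starting at the assumed independency consists solely of independencies.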

Lemma  2  (Single  Independency  Thread  Property  of  FI)  Every  FI  proof  has  at  most  one 
independency  thread.  If  the  conclusion  is  an  independency  then  it  has  an  independency  thread, 
otherwise  it  does  not  have  any. 
Proof: (By induction on the height of the proof tree)

Suppose $t$ is a proof tree of least height of $\psi$ from $\Sigma \cup \Sigma'$, where $\psi$ is either a functional
independency or a functional dependency.

Case 1. $\psi$ is of the form $(X \nrightarrow Y)$.

We show by induction on the height of $t$ that there is exactly one thread where the main connective
is $\nrightarrow$.

In this case, the proof rules that could have been used in the last step are FI1, FI2 or FI3. Then
$t$ is of the following form:

$$\dfrac{V \to W \qquad P \nrightarrow Q}{X \nrightarrow Y}\ \text{(FI2)}$$

Consequently, by the inductive argument, the subproof of $t$ that has $(P \nrightarrow Q)$ as its conclusion,
which has a smaller height, has a single thread of independencies. Moreover, by Case 2 below,
$\nrightarrow$ does not appear in the subproof of $(V \to W)$. Hence $t$ has a single thread of independencies,
namely the thread that extends the thread in the subproof of $(P \nrightarrow Q)$ by adding $(X \nrightarrow Y)$ to
its bottom.

Case 2. $\psi$ is of the form $(X \to Y)$.

In this case, we show that $\nrightarrow$ does not appear in the proof tree.

Because the main connective of the conclusion is $\to$, only Armstrong’s axioms (i.e.
FD1, FD2 or FD3) could have been applied at the last step of the proof. If the last rule applied is
either FD1 or FD2, then $t$ is of the following form:

$$\dfrac{\dfrac{t_1}{P \to Q}}{X \to Y}$$

Then, by the inductive hypothesis, $\nrightarrow$ does not appear in $\frac{t_1}{P \to Q}$.

Suppose the last rule used is FD3. Then $t$ is of the following form:

$$\dfrac{\dfrac{t_1}{P_1 \to Q_1} \qquad \dfrac{t_2}{P_2 \to Q_2}}{X \to Y}\ \text{(FD3)}$$

Hence, by the inductive hypothesis, $\nrightarrow$ does not appear in either $\frac{t_i}{P_i \to Q_i}$ for $i = 1, 2$. ■

Corollary 1 (Independence Property: Corollary to Lemma 2) Let $\Sigma$ be a set of functional
dependencies and $\Sigma'$ be a set of functional independencies. If $\Sigma \cup \Sigma' \vdash (X \nrightarrow Y)$ then there are
some $R$, $S$ such that $(R \nrightarrow S) \in \Sigma'$ and $\Sigma \cup \{(R \nrightarrow S)\} \vdash (X \nrightarrow Y)$.

Proof:

Suppose $t$ is a proof of $(X \nrightarrow Y)$ from $\Sigma \cup \Sigma'$. Then, by Lemma 2, $t$ has a unique independency
thread. Let $(R \nrightarrow S)$ be at the head of this independency thread. Then $(R \nrightarrow S)$ is the only
functional independency that is used as an assumption in $t$. Hence $t$ is a proof of $(X \nrightarrow Y)$
from $\Sigma \cup \{(R \nrightarrow S)\}$. ■
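
Corollary 1 suggests examining one independency at a time. The sketch below is a semantic test, not the syntactic proof search developed in this paper: by contraposition over Definition 1, a table satisfying the dependencies and $R \nrightarrow S$ must satisfy $X \nrightarrow Y$ exactly when the dependencies together with $X \to Y$ entail $R \to S$, and the latter is checked with the standard attribute-closure procedure. It agrees with derivability in the FI system only under the soundness and completeness that the paper sets out to establish. The function names and the set-pair representation are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return frozenset(result)

def witnesses(fds, indeps, lhs, rhs):
    """Independencies (R, S) in indeps that, with fds, force lhs -/-> rhs."""
    # Add the candidate dependency lhs -> rhs and see which R -> S it would imply.
    extended = list(fds) + [(frozenset(lhs), frozenset(rhs))]
    return [(r, s) for r, s in indeps if set(s).issubset(closure(r, extended))]

# The rule of Section 4: from X -> Y and Z -/-> Y conclude Z -/-> X.
sigma = [(frozenset({"X"}), frozenset({"Y"}))]
sigma_prime = [(frozenset({"Z"}), frozenset({"Y"}))]

print(witnesses(sigma, sigma_prime, {"Z"}, {"X"}))  # the single independency suffices
print(witnesses(sigma, sigma_prime, {"X"}, {"Z"}))  # no witness
```

A non-empty result names the independencies that can each serve as the single assumption $(R \nrightarrow S)$ of the corollary.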
4  Incompleteness of Janas’ System

In  this  section,  using  proof-theoretic  arguments  we  show  that  proof  rules  of  Janas  are  incomplete 
for  functional  independencies.  In  particular,  as  stated  by  Bell  [Bel95a,  Bel95b],  we  show  that  the 
following  proof  rule  is  sound,  but  cannot  be  derived  in  Janas’  rule  system. 

$$\frac{X \to Y \qquad Z \nrightarrow Y}{Z \nrightarrow X}$$

In  order  to  justify  our  claim,  we  need  some  properties  about  Janas’  system,  which  are  in  the 
following  lemmas. 

Lemma 3

1.  The following proof rule is sound.

$$\frac{X \to Y \qquad Z \nrightarrow Y}{Z \nrightarrow X}$$

2.  Janas’ system has the single independency thread property.

3.  Janas’ system has the dependence property.

Proof:

For the proof of (1), which is a rather trivial fact, see [Bel95b] and [Bel95a]. The proofs of (2) and
(3) are similar to the corresponding proofs for our FI system.


Lemma 4 (Incompleteness of Janas’ Proof System) The following proof rule cannot be
derived in Janas’ system of rules.

$$\frac{X \to Y \qquad Z \nrightarrow Y}{Z \nrightarrow X}\ \text{(FI3)}$$


Proof:  This  can  be  easily  seen  semantically.  A  detailed  syntactic  proof  appears  in  the  appendix. 


5  Normal  Forms  for  Proofs 

In this section, we prove a normal form theorem for proofs in our system. We show that every
proof in our system is equivalent to one in which there are at most three applications of FI axioms,
in the order FI3, FI1, FI2.

Towards  this  end  we  need  some  auxiliary  facts,  which  are  summarized  below. 

•  Repeated applications of any independency rule can be replaced by a single application of
the same rule.

•  The order of FI2 and FI3 can be interchanged.

•  The order of applications of independency rules FI2, FI1 or FI1, FI3 can be reversed, but not
vice-versa.

Section  5.1  is  devoted  to  precise  statements  of  these  facts,  which  are  proved  in  the  appendix. 
5.1  Auxiliary Facts

Lemma 5 (Proof Rule Merging) The following facts hold about repeated applications of proof
rules.

1.  A sequence of successive applications of FI1 is equivalent to a single application of FI1, i.e.
a proof $t$ whose single independency thread has a sequence of applications of FI1 is
equivalent to a proof that has a single application of FI1.

2.  A sequence of successive applications of FI2 is equivalent to a single application of FI2, i.e.
a proof $t$ whose single independency thread has a sequence of applications of FI2 is
equivalent to a proof that has a single application of FI2.

3.  A sequence of successive applications of FI3 is equivalent to a single application of FI3, i.e.
a proof $t$ whose single independency thread has a sequence of applications of FI3 is
equivalent to a proof that has a single application of FI3.

Proof: 

See  Appendix  A.  ■ 


Lemma 6 (Proof Rule Interchangeability) The following facts hold about the interchangeability
of inference rules in FI proofs.

•  For every proof fragment in which FI3 is applied immediately after FI2, there is an equivalent
proof fragment in which FI2 is applied after FI3.

•  The following hold for the reversal of application orders of rules FI1, FI2 and FI3.

1.  For every proof fragment in which FI1 is applied immediately after FI2, there is an
equivalent proof fragment in which FI2 is applied after FI1.

2.  For every proof fragment in which FI3 is applied immediately after FI1, there is an
equivalent proof fragment in which FI1 is applied after FI3.


Proof: 

See  Appendix  A.  ■ 

Using Lemmas 5 and 6, we show that every proof in our system can be reduced to a normal
form. In this normal form, every proof has at most three applications of functional independency
rules, and furthermore they are applied in the order FI3, FI1 and FI2. Accordingly, we define
normal forms for proofs.

5.2  Proof  of  the  Normal  Form  Theorem 

In  this  section  we  state  and  prove  the  normal  form  theorem. 

Definition 8 (Normal Form) A proof is said to be in Normal Form if and only if its unique
independency thread has at most three applications of independency rules, in the order FI3, FI1,
FI2, if they appear at all.
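
At the level of rule labels, Definition 8 is easy to test. The sketch below assumes that the independency thread of a proof has already been summarized as the list of FI rule names applied along it, read from the top of the thread to the conclusion; this encoding and the name `is_normal_form` are hypothetical, and the check says nothing about whether the underlying rule instances are valid.

```python
NORMAL_ORDER = ("FI3", "FI1", "FI2")

def is_normal_form(thread_rules):
    """True if each FI rule occurs at most once and in the order FI3, FI1, FI2."""
    applied = [r for r in thread_rules if r in NORMAL_ORDER]
    if len(applied) != len(set(applied)):
        return False  # some rule is applied more than once
    positions = [NORMAL_ORDER.index(r) for r in applied]
    return positions == sorted(positions)

print(is_normal_form(["FI3", "FI1", "FI2"]))  # full normal form
print(is_normal_form(["FI3", "FI2"]))         # rules may be absent
print(is_normal_form(["FI1", "FI3"]))         # wrong order
```

A proof-search procedure that enumerates only label sequences passing this test visits at most eight shapes of independency thread, which is the restriction the normal form theorem is meant to justify.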
Now we show a weak normalization theorem, namely that every proof in our system has
a normal form. The proof of the normal form theorem, while syntactic in nature, consists of
three main steps. In the first step, we use Lemma 5 to reduce successive applications of the
same independency rule to a single application of the rule, resulting in a proof without successive
applications of the same rule. This lets us visualize the independency thread as consisting of
a sequence of blocks, where each block begins with an application of FI1, followed by an
application of either FI2 or FI3, followed by the other rule. Then we show that the interchangeability
lemmas can be used to reduce such a proof segment to the order FI3, FI1, FI2. Lastly, we show
that any two successive blocks can be reduced to a single block.

Definition 9 (Block) A fragment of a proof is said to be a block if it has an independency thread
in which either:

•  There are at most three applications of distinct independency rules, of which the first one is
FI1, and the other two are applications of distinct independency rules FI2 and FI3 in any
order.

•  Or there are at most two applications of distinct independency rules FI2 and FI3 in any order.

Definition 10 (Normal Block) A fragment of a proof is said to be a normal block if it is a block
in which the independency rules are applied in the order FI3, FI1, FI2.

Lemma 7 (Blocking of Proofs) Suppose $\Sigma \cup \Sigma' \vdash (X \not\to Y)$. Then there is a proof $t$ of $(X \not\to Y)$
in which the unique independency thread consists of a sequence of blocks, of which only the first
block (i.e. the block at the top of the independency thread) may miss an application of FI1.

Proof:

Suppose $\Sigma \cup \Sigma' \vdash (X \not\to Y)$. Then there is a proof $t_1$ of $(X \not\to Y)$ from $\Sigma \cup \Sigma'$. By applying Lemma 5
to $t_1$, we obtain a proof $t_2$ of $(X \not\to Y)$ from $\Sigma \cup \Sigma'$ that does not contain successive applications
of FI1, FI2 or FI3.

Then, define the blocks in $t_2$ as the proof segments starting with any application of FI1 and
extending up to, but excluding, the next application of FI1 along the unique independency thread.
If the first rule applied is not FI1, then the first block may contain FI2 and/or FI3 in any
order. ■


Lemma 8 (Block Normalization) For every block there is an equivalent normal block.

Proof:

Suppose $b$ is a block. Then, by definition, either the unique independency thread of $b$ does not have an
application of FI1, in which case (if need be) Lemma 6 can be used to interchange the application
order of rules FI2 and FI3 to make it a normal block, or it has an application of FI1 at the top of
the independency thread, i.e. at the beginning of the independency thread.

If the first rule applied in the independency thread is FI1, and the order of application
of the other rules is FI2, FI3, then by Lemma 6 there is an equivalent proof fragment $b_1$ where the
rules are applied in the order FI1, FI3, FI2. Then, by Lemma 6, there is an equivalent normal block
$b_{normal}$. ■

Lemma 9 (Normal Block Merging) A sequence of two normal blocks can be reduced to a normal
block.

Proof:

Suppose a proof fragment $t$ consists of two successive normal blocks $a$ and $b$. Let the blocks $a$
and $b$ both have applications of all three independency rules FI3, FI1 and FI2, respectively
denoted $a_3, a_1, a_2$ and $b_3, b_1, b_2$. By using the interchangeability lemmas and merging, they can
be transformed into a proof in normal form, as given below.

1. Apply Lemma 6 to get a proof segment in which the order is $a_3, a_1, b_3, a_2, b_1, b_2$.

2. Apply Lemma 6 to get a proof segment in which the order is $a_3, b_3, a_1, a_2, b_1, b_2$.

3. Apply Lemma 6 to get a proof segment in which the order is $a_3, b_3, a_1, b_1, a_2, b_2$.

4. Apply Lemma 5 to merge the successive applications of rules FI3, FI1 and FI2 in
$a_3, b_3$; $a_1, b_1$; and $a_2, b_2$, respectively, into a single application of each rule.

In the case of degenerate blocks, i.e., where applications of one or two independency rules are
missing, we can still apply the same procedure to group applications of the same rule together. Some
steps in the process become redundant because of the absence of some of the independency
rule applications. The details are given in the appendix. ■

Theorem 1 (Normal Form Theorem for Proofs) Suppose $\Sigma \cup \Sigma' \vdash (X \not\to Y)$. Then there is
a normal form proof of $(X \not\to Y)$ from $\Sigma \cup \Sigma'$.

Proof:

Suppose $t$ is a proof of $(X \not\to Y)$ from $\Sigma \cup \Sigma'$. Then:

1. Apply the transformations given in Lemma 7 to obtain a proof $t_1$ in which the independency
thread consists of blocks.

2. Apply the transformations given in Lemma 8 to every block in $t_1$ to obtain an equivalent
proof $t_2$ in which every block is a normal block.

3. Inductively apply the transformation given in Lemma 9 to the blocks of $t_2$ to obtain an equivalent
proof $t_3$, which consists of a single block.
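
The three steps of this argument can be mimicked at the level of rule labels alone. The following Python sketch is illustrative only: it represents an independency thread as a list of rule names, assumes (as Lemmas 5 and 6 are used above) that equal neighbours can always be merged and unequal neighbours can always be interchanged, and ignores the sequents themselves. The names `merge_successive` and `normalize_thread` are hypothetical and do not come from the paper.

```python
# Label-level sketch only: each rule application is reduced to its name.
NORMAL_ORDER = ["FI3", "FI1", "FI2"]


def merge_successive(thread):
    """Lemma 5 at the label level: collapse runs of the same rule."""
    merged = []
    for rule in thread:
        if not merged or merged[-1] != rule:
            merged.append(rule)
    return merged


def normalize_thread(thread):
    """Return the rule labels of a normal-form thread for `thread`.

    Assumption: adjacent applications of different rules may always be
    interchanged (Lemma 6) and equal neighbours merged (Lemma 5).
    """
    thread = merge_successive(thread)
    rank = {rule: i for i, rule in enumerate(NORMAL_ORDER)}
    # Bubble sort uses only adjacent swaps, mirroring repeated use of Lemma 6.
    changed = True
    while changed:
        changed = False
        for i in range(len(thread) - 1):
            if rank[thread[i]] > rank[thread[i + 1]]:
                thread[i], thread[i + 1] = thread[i + 1], thread[i]
                changed = True
    # Equal rules are now adjacent, so one more merge leaves at most three.
    return merge_successive(thread)
```

For instance, the two-block thread `["FI1", "FI2", "FI3", "FI1", "FI3", "FI2"]` is reduced to `["FI3", "FI1", "FI2"]`, which has the shape required by Definition 8.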


5.3  Proof-Theoretic  Properties  of  Functional  Dependencies 

In  this  section,  we  show  some  proof-theoretic  properties  that  are  used  in  constructing  Armstrong 
relations. 

Lemma 10 (Merging Lemma for Functional Dependencies) Successive applications of the
Augmentation Rule (i.e. FD2) are equivalent to a single application of FD2; i.e., given a proof $t$ in
which there are two successive applications of FD2 on a proof thread, they can be replaced with a
single application of FD2.

Proof: 

See Appendix A.

Lemma 11 (Interchange Lemma for Functional Dependencies) Any proof fragment in which
the order of application is FD3, FD2 can be replaced by a proof fragment in which the order of
application is FD2, FD3.

Proof:

See Appendix A.

The results in Lemmas 10 and 11 can be combined to show that all proofs for FDs can be
transformed into a standard form called semi-normal form.

Definition  11  (Semi-Normal  Form  for  Functional  Dependency  Proofs)  We  say  that  a  proof 
t  of  a  functional  dependency  is  in  semi-normal  form  if  it  satisfies  the  following  properties. 

• The application of FD2 in $t$ is limited to once per proof thread in $t$.

• If FD2 is applied in a proof thread in $t$, then it is applied to the top sequent of the thread.

We now show that every proof in FD has a semi-normal form.

Theorem 2 (Semi-Normal Form Theorem for FD Proofs) Any proof of a functional
dependency $(X \to Y)$ can be transformed to a proof in semi-normal form.

Proof:

By applying Lemma 10, successive applications of FD2 can be replaced by a single application of
FD2, and by applying Lemma 11, applications of FD2 can be pushed up to the top sequents of proof
threads. ■

In the next theorem we show that any proof in which a given dependency $X \to Y$ appears
more than once as an assumption can be replaced with an equivalent proof in which it appears only
once as an assumption. To prove this result, the following definition is in order.

Definition 12 (Transitive Envelope of an FD Proof Tree) Consider an FD proof $t$ in
semi-normal form. Suppose $\tau_1, \ldots, \tau_n$ is a left-to-right listing of all proof threads of $t$. Then a listing
$(X_1 \to Y_1), \ldots, (X_n \to Y_n)$ of functional dependencies satisfying the following properties is called
the transitive envelope of $t$.

• $(X_i \to Y_i)$ is on $\tau_i$ for all $i \le n$. Suppose the position at which $(X_i \to Y_i)$ appears in $\tau_i$ is $\gamma_i$.

• $\gamma_i$ is the farthest position from the conclusion of $t$ such that there is no application of FD2
between $\gamma_i$ and the conclusion of $t$.

The  following  definition  states  properties  of  transitive  envelopes,  which  are  needed  in  later 
proofs. 

Definition 13 (Chains of Dependencies) A listing of dependencies of the form $(X \to X_1), (X_1 \to
X_2), \ldots, (X_n \to Y)$ is said to be a chain of dependencies. We say that $X$ is the head and $Y$ is the
tail of the chain.
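
As a small illustration of this definition, the following Python sketch checks the chain condition and extracts the head and the tail. The representation is an assumption made for the example: a dependency is a pair of `frozenset`s of attribute names, and the function names `is_chain` and `head_and_tail` are hypothetical.

```python
def is_chain(deps):
    """True if each right-hand side equals the next left-hand side."""
    return all(deps[i][1] == deps[i + 1][0] for i in range(len(deps) - 1))


def head_and_tail(deps):
    """Return (head, tail) of a non-empty chain of dependencies."""
    if not deps or not is_chain(deps):
        raise ValueError("not a non-empty chain")
    return deps[0][0], deps[-1][1]


# Example: (A -> BC), (BC -> D) is a chain with head A and tail D.
fs = frozenset
example = [(fs("A"), fs("BC")), (fs("BC"), fs("D"))]
example_head, example_tail = head_and_tail(example)
```

Repeated use of the transitivity rule FD3 along such a chain yields the dependency from its head to its tail.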

The  next  theorem  proves  an  important  property  of  a  transitive  envelope  of  a  FD  proof  tree. 

Theorem 3 (Structural Property of Transitive Envelopes) Suppose $t$ is a semi-normal form
proof of $(X \to Y)$ and $T$ is the transitive envelope of $t$. If $T$ is non-null, then it is a chain with
head $X$ and tail $Y$; i.e. $(X_1 \to X_2), \ldots, (X_n \to X_{n+1})$ where $X$ is $X_1$ and $Y$ is $X_{n+1}$.

Proof:

See  Appendix  A. 


As  Theorem  3  states,  the  transitive  envelope  of  a  FD  proof  is  a  chain.  The  next  theorem  shows 
that  repeated  assumptions  in  this  chain  can  be  removed,  i.e.  that  cycles  can  be  removed. 

Theorem 4 (Repetition Removal from Transitive Envelopes) Every proof $t$ of a functional
dependency $(X \to Y)$ in which the transitive envelope $T$ has a repetition of some functional
dependency $(A \to B)$ can be reduced to a proof $t'$ in which $(A \to B)$ is not repeated in its transitive
envelope.

Proof: 

See  Appendix  A.  ■ 
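
At the level of the envelope alone, removing a repetition amounts to cutting a cycle out of a chain. The following Python sketch shows only this list-level effect, under the assumption that the envelope is a chain in the sense of Definition 13; it does not reproduce the proof transformation of Appendix A. Dependencies are represented as pairs of strings, and the name `remove_repetitions` is hypothetical.

```python
def remove_repetitions(chain):
    """Cut out the segment between two occurrences of the same dependency.

    If (A -> B) occurs at positions i < j, the entries i+1 .. j are dropped.
    The entry after position j starts with B, so the result is still a chain
    with the same head and tail.
    """
    result = []
    seen = {}  # dependency -> its index in `result`
    for dep in chain:
        if dep in seen:
            cut = seen[dep] + 1
            for removed in result[cut:]:
                del seen[removed]
            result = result[:cut]
        else:
            seen[dep] = len(result)
            result.append(dep)
    return result


looping = [("X", "A"), ("A", "B"), ("B", "A"), ("A", "B"), ("B", "Y")]
shortened = remove_repetitions(looping)
```

Here `shortened` is `[("X", "A"), ("A", "B"), ("B", "Y")]`: the cycle through `("B", "A")` has been removed, while the head `X` and the tail `Y` are unchanged.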

In the next theorem we show that proofs in FD can be reduced to a form where assumptions
that are functional dependencies are used at most once. In order to prove it, we need the following
lemmas.

Lemma 12 (Some Useful Proof Fragments) The following are auxiliary facts.

1. There is an FD proof of $(F \to WY)$ from assumptions $(F \to A)$, $(A \to XV)$ and $W \subseteq V$.

2. There is a proof of $(YV \to YB)$ from assumptions $W \subseteq V$, $(YW \to XA)$ and $B \subseteq A$.

Proof:

See Appendix A.

Results such as Lemma 12 state some obvious monotonicity facts about $\to$ and $\not\to$ with respect
to $\subseteq$.

Lemma 13 (Fusing FD Proofs) Suppose $t_1, \ldots, t_n$ is a sequence of proofs that have, respectively,
$(A_1 \to A_2), \ldots, (A_n \to A_{n+1})$ as their conclusions; then there is a proof of $(A_1 \to A_{n+1})$ that has
the same assumptions as those of $t_1, \ldots, t_n$.

Proof:

By applying FD3 repeatedly, we can create a proof of $(A_1 \to A_{n+1})$ from the chain $(A_1 \to
A_2), \ldots, (A_n \to A_{n+1})$. By fusing each proof tree $t_i$ on top of $(A_i \to A_{i+1})$ for all $1 \le i \le n$, we
get the desired result. ■

Now, we use Lemma 12 to generalize Theorem 4.

Theorem 5 (Repetition Removal from Assumptions) For every proof $t$ of $(E \to F)$ in FD
and every assumption $(X \to F)$ used in $t$, there is an equivalent proof $t'$ in which the assumption
$(X \to F)$ is used at most once.

Proof: 

See Appendix A. ■

5.4 Proof Inversions

In this section, we show that proofs that assert a functional dependency can be constructively
transformed into proofs that assert functional independencies, and vice versa. Specifically, we show
that if $\Sigma \cup \{X \not\to Y\} \vdash (P \not\to Q)$ then $\Sigma \cup \{P \to Q\} \vdash (X \to Y)$, and vice versa. This fact is later
used in the proof of the completeness theorem. The results contained in this section seem trivial
from semantic considerations. They are stated for the sake of completeness and to show that
the syntactic method used throughout this paper is capable of showing all necessary facts.

Definition  14  (Inverse  Fragments)  Consider  the  following  proof  fragments. 


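(1) From $(X \to Y)$, proved by $t_1$, and $(Y \to Z)$, proved by $t_2$, infer $(X \to Z)$ by FD3.

(2) From $(X \to Y)$, proved by $t_1$, and $(X \not\to Z)$, infer $(Y \not\to Z)$ by FI2.

(3) From $(Y \to Z)$, proved by $t_1$, and $(X \not\to Z)$, infer $(X \not\to Y)$ by FI3.

(4) From $A \supseteq B$ and $(X \to Y)$, infer $(AX \to BY)$ by FD2.

(5) From $(AX \to B)$, which follows from $A \supseteq B$, and $(AX \not\to BY)$, infer $(AX \not\to Y)$ by FI1.
Then from $(AX \to X)$ and $(AX \not\to Y)$, infer $(X \not\to Y)$ by FI2.

(6) From $(V \to W)$, proved by $t$, and $(V \not\to YW)$, infer $(V \not\to Y)$ by FI1.

(7) From $V \subseteq V$ and $(V \to W)$, proved by $t$, infer $(V \to VW)$ by FD2. From $W \subseteq W$ and
$(V \to Y)$, infer $(VW \to YW)$ by FD2. Then from $(V \to VW)$ and $(VW \to YW)$, infer $(V \to YW)$ by FD3.
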

In these proof fragments, (2) and (3) are said to be respectively the left and the right inverse
of (1), and conversely (1) is said to be the inverse of (2) and (3). Similarly, (4) and (5) are
said to be inverses of each other, and (6) and (7) are said to be inverses of each other.

We denote the inverse, the left inverse and the right inverse of a proof fragment $f$
respectively as $f^{-1}$, $f_L^{-1}$ and $f_R^{-1}$.

Now  we  show  the  following  properties  about  inverse  fragments. 

Lemma 14 (Properties of Inverse Fragments) The proof fragments listed in Definition 14
have the property that if the fragment proves $(P \to Q)$ from $(X \to Y)$, possibly using $t_1$, then its
inverse fragment (if applicable, left and right inverses) proves $(X \not\to Y)$ from $(P \not\to Q)$ (the inverse
uses $t_1$ if the original fragment used $t_1$). Conversely, if a proof fragment listed in Definition 14
proves $(P \not\to Q)$ from $(X \not\to Y)$ using the assumption $t_1$, then its inverse proves $(X \to Y)$ from
$(P \to Q)$ using the assumption $t_1$.

Proof: 

The  fragments  and  their  inverses  (left  and  right,  if  applicable)  are  listed  in  Definition  14,  with 
the  corresponding  proof  rules  used  to  justify  the  fragment.  ■ 


Definition 15 Suppose $t$ is a proof in FD and $\gamma$ is a thread in $t$ where the topmost sequent of $\gamma$ is a
functional dependency, say $(X \to Y)$. Then define the $\gamma$-inverse of $t$ (notation $t^{-1}(\gamma)$) inductively
as follows.

Base Case:

Suppose $t$ consists of only $(X \to Y)$. Then define $t^{-1}(\gamma)$ as $(X \not\to Y)$.

Inductive Case:

Let $t'$ be the proof that uses the consequent of the first application of a proof rule to $(X \to Y)$ as its
assumption. Let $\gamma'$ be the proof thread in $t'$ that is obtained by removing the first sequent from $\gamma$.

• Suppose the first proof rule applied on $\gamma$ is FD2, and let $f$ be the proof fragment that constitutes
the application of FD2, say the inference of $(AX \to BY)$ from $A \supseteq B$ and $(X \to Y)$. If $f$ constitutes
all of $t$, then define $t^{-1}(\gamma)$ as $f^{-1}$.

Otherwise, define $t^{-1}(\gamma)$ as the proof obtained by fusing the consequent of $t'^{-1}(\gamma')$ to the
assumption that is the only functional independency in $f^{-1}$. The next theorem shows that
this functional independency is $(AX \not\to BY)$, so that they can be fused, and the resulting tree
constitutes a valid proof.

• Suppose the first proof rule applied on $\gamma$ is FD3, and that $(X \to Y)$ is the left antecedent of
that application of FD3. Let $f$ be the proof fragment corresponding to this application. If $f$
constitutes all of $t$, then define $t^{-1}(\gamma)$ as $f_L^{-1}$.

Otherwise, define $t^{-1}(\gamma)$ to be the proof obtained by fusing the consequent of $t'^{-1}(\gamma')$ to the
assumption that is the only functional independency in $f_L^{-1}$, say $(A \not\to B)$. The next theorem
shows that $t'^{-1}(\gamma')$ proves $(A \not\to B)$, so that they can be fused, and the resulting tree constitutes
a valid proof.

• Suppose the first proof rule applied on $\gamma$ is FD3, and that $(X \to Y)$ is the right antecedent of
that application of FD3. Let $f$ be the proof fragment corresponding to this application. If $f$
constitutes all of $t$, then define $t^{-1}(\gamma)$ as $f_R^{-1}$.

Otherwise, define $t^{-1}(\gamma)$ to be the proof obtained by fusing the consequent of $t'^{-1}(\gamma')$ to the
assumption that is the only functional independency in $f_R^{-1}$, say $(A \not\to B)$. The next theorem
shows that $t'^{-1}(\gamma')$ proves $(A \not\to B)$, so that they can be fused, and the resulting tree constitutes
a valid proof.

Let $t$ be a proof in which the functional independency $(X \not\to Y)$ is at the top of the unique
independency thread. Then define the inverse of $t$ (notation $t^{-1}$) inductively as follows.

Base Case:

Suppose $t$ consists of only $(X \not\to Y)$. Then define $t^{-1}$ as $(X \to Y)$.

Inductive Case:

Let $t'$ be the proof that takes the consequent of the first application of the appropriate proof rule to
$(X \not\to Y)$ as its assumption. Then define $t^{-1}$ as follows.

Let $f$ be the proof fragment corresponding to the first application of a proof rule (FI1, FI2 or
FI3). If $f$ constitutes all of $t$, then define the inverse of $t$ to be $f^{-1}$.

Otherwise, define $t^{-1}$ to be the proof obtained by fusing the consequent of $t'^{-1}$ to the right
assumption of the proof fragment $f^{-1}$. The next theorem shows that they can be fused, and the
resulting tree constitutes a valid proof.

In  the  next  theorem,  we  show  that  the  results  shown  in  Lemma  14  about  inversions  of  proof 
fragments  carry  over  to  inversions  of  complete  proofs. 

Theorem 6 (Properties of Inverse Proofs) The proofs listed in Definition 15 have the
property that if $t$ proves $(P \to Q)$ from $(X \to Y)$, possibly using a set of functional dependencies, say
$\Sigma$, then its inverse proof indexed by a thread $\gamma$, $t^{-1}(\gamma)$, proves $(X \not\to Y)$ from $(P \not\to Q)$, possibly
using $\Sigma$. Conversely, if $t$ proves $(P \not\to Q)$ from $(X \not\to Y)$ and other dependencies $\Sigma$, then $t^{-1}$
proves $(X \to Y)$ from $\Sigma \cup \{P \to Q\}$.

Proof: 

See  Appendix  A. 

6  Consequences  of  the  Normal  Form  Theorem 

In  this  section,  we  explore  the  consequences  of  the  normal  form  theorem  which  are  relevant  in  the 
completeness  proof.  In  order  to  do  so,  we  need  to  define  consistency  for  a  set  of  sentences. 

Definition 16 We say that $\Sigma \cup \Sigma'$ is consistent if there are no sets of attributes $W$ and $V$
such that $\Sigma \cup \Sigma'$ derives both $(W \not\to V)$ and $(W \to V)$. Here $\Sigma$ is a set of dependencies and
$\Sigma'$ is a set of independencies.

Lemma 15 (Inconsistency Test) Suppose $\Sigma$ is a set of dependencies and $\Sigma'$ is a set of
independencies. $\Sigma \cup \Sigma'$ is inconsistent if and only if there is an independency $(P \not\to Q) \in \Sigma'$ such that
$\Sigma \vdash (P \to Q)$.

Proof: 

See  Appendix  A.  ■ 
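
Lemma 15 reduces the inconsistency test to derivability of a dependency from $\Sigma$ alone. The following Python sketch illustrates this under two assumptions that are not taken from the paper: attributes are single characters (so an attribute set is a string), and derivability of a functional dependency is decided by the standard attribute-closure computation rather than by searching for a proof in the system above. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side an iterable of attributes.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def derives(fds, lhs, rhs):
    """True if lhs -> rhs follows from `fds` (closure test)."""
    return set(rhs) <= closure(lhs, fds)


def is_inconsistent(fds, fis):
    """Lemma 15 as a test: some independency P -/-> Q has P -> Q derivable."""
    return any(derives(fds, p, q) for p, q in fis)


sigma = [("A", "B"), ("B", "C")]
clash = is_inconsistent(sigma, [("A", "C")])     # A -> C is derivable
no_clash = is_inconsistent(sigma, [("C", "A")])  # C -> A is not derivable
```

With $\Sigma = \{A \to B, B \to C\}$, adding the independency $(A \not\to C)$ is flagged as inconsistent, whereas adding $(C \not\to A)$ is not.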


Lemma 16 (Consistency when adding a dependency) Suppose $\Sigma$ is a set of dependencies
and $\Sigma'$ is a set of independencies. If $\Sigma \cup \Sigma'$ is consistent and $\Sigma \cup \Sigma' \nvdash (X \not\to Y)$ then $\Sigma \cup \Sigma' \cup \{X \to Y\}$
is consistent.

Proof:

See Appendix A. ■

Lemma 17 (Consistency when adding an independency) Suppose $\Sigma$ is a set of
dependencies and $\Sigma'$ is a set of independencies. If $\Sigma \cup \Sigma'$ is consistent and $\Sigma \cup \Sigma' \nvdash (X \to Y)$ then
$\Sigma \cup \Sigma' \cup \{X \not\to Y\}$ is consistent.

Proof: 

See Appendix A. ■

7 Completeness of the Proof System

In this section, we present the consistency properties for the FI-system. Then we show that every
consistency property yields a model. Finally we show the completeness theorem by proving that
if $\Sigma \cup \Sigma' \nvdash \psi$, then there is a complete consistency property that contains $\Sigma \cup \Sigma'$ but not $\psi$. Proofs
of this kind are common in the model theory of first-order logic.

Definition 17 (Consistency Property) We say that a set $C$ is a consistency property if the
following hold.

• Non-Contradictory Nature

For every $R, S \subseteq U$, not both $(R \to S) \in C$ and $(R \not\to S) \in C$ hold.

• Closure Under Proof Rules

$C$ is closed under the proof rules; i.e.

1. If $X \subseteq Y$ then $(Y \to X) \in C$.

2. If $(B \to A) \in C$ and $X \subseteq Y$ then $(YB \to XA) \in C$.

3. If $(X \to Y), (Y \to Z) \in C$ then $(X \to Z) \in C$.

4. If $(B \to A), (B \not\to Y) \in C$ then $(A \not\to Y) \in C$.

5. If $(Y \to Z), (X \not\to Z) \in C$ then $(X \not\to Y) \in C$.

6. If $(X \to Y), (X \not\to Z) \in C$ then $(Y \not\to Z) \in C$.

• Disjunctive Nature of $\not\to$

If $S = \{S_i : 1 \le i \le n\}$ where each $S_i$ is a single attribute, and $(R \not\to S) \in C$, then
$(R \not\to S_i) \in C$ for some $i \le n$.

Definition 18 (Complete Consistency Property) We say that a set $C$ is a complete
consistency property if the following properties hold.

• $C$ is closed under the proof rules given in Definition 17.

• For every $R, S \subseteq U$, one and only one of $(R \to S) \in C$, $(R \not\to S) \in C$ holds.

The next lemma shows an important property of complete consistency properties.

Lemma 18 (Conjunctive and Disjunctive Nature of Consistency Properties) Suppose $C$
is a set of dependencies and independencies.

• Conjunctive Nature of $\to$

If $C$ is a consistency property, then it satisfies the conjunctive nature of $\to$; i.e., if $S = \{S_i :
1 \le i \le n\}$ and $(R \to S) \in C$, then $(R \to S_i) \in C$ for all $i \le n$.

• Disjunctive Nature of $\not\to$

If $C$ is a complete consistency property, then it satisfies the disjunctive nature of $\not\to$; i.e., if
$S = \{S_i : 1 \le i \le n\}$ where each $S_i$ is a single attribute, and $(R \not\to S) \in C$, then $(R \not\to S_i) \in C$
for some $i \le n$.

Proof:

The conjunctive nature of $\to$ holds in a consistency property $C$ because $(R \to S) \vdash_{Armstrong} (R \to
S_i)$, and $C$ is closed under deduction.

If the disjunctive nature of $\not\to$ is not true in a complete consistency property $C$, then there is
$S = \{S_i : 1 \le i \le n\}$ where each $S_i$ is a single attribute, and $(R \not\to S) \in C$, but $(R \not\to S_i) \notin C$ for
all $i \le n$. Then, because $C$ is complete, $(R \to S_i) \in C$ for all $i \le n$. But $\{R \to S_i : 1 \le i \le n\} \vdash (R \to S)$,
leading to a contradiction, because now $(R \to S), (R \not\to S) \in C$.
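
The same two properties can be observed semantically on a concrete table. The following Python sketch assumes, as in the proofs of Lemma 20, that a relation satisfies $(R \not\to S)$ exactly when it contains two rows that agree on $R$ and differ on $S$; rows are represented as dictionaries, and the function names are hypothetical.

```python
from itertools import combinations


def agree(row1, row2, attrs):
    """True if the two rows have equal values on every attribute in `attrs`."""
    return all(row1[a] == row2[a] for a in attrs)


def satisfies_fd(rows, lhs, rhs):
    """Rows that agree on `lhs` also agree on `rhs`."""
    return all(agree(r1, r2, rhs)
               for r1, r2 in combinations(rows, 2)
               if agree(r1, r2, lhs))


def satisfies_fi(rows, lhs, rhs):
    """Some pair of rows agrees on `lhs` and differs on `rhs`."""
    return not satisfies_fd(rows, lhs, rhs)


rows = [
    {"R": 1, "S1": 1, "S2": 1},
    {"R": 1, "S1": 1, "S2": 2},
]
# R -/-> S1 S2 holds, and it is witnessed by the single attribute S2.
witnesses = [s for s in ("S1", "S2") if satisfies_fi(rows, ["R"], [s])]
```

In this table `witnesses` is `["S2"]`: the independency on the set $\{S_1, S_2\}$ holds because it holds for at least one single attribute, while a dependency on a set holds only if it holds for every attribute of the set.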


Lemma 19 (Complete Consistency Property and Consistency Property) Every complete
consistency property is a consistency property.

Proof:

Suppose $C$ is a complete consistency property. By Lemma 18, $C$ satisfies the disjunctive nature of
$\not\to$. Hence, $C$ is a consistency property. ■


7.1  Constructing  Models  from  Consistency  Properties 

Theorem 7 (Constructing Models) If $C$ is a consistency property, then there is a model $M(C)$
with the following properties.

1. If $(R \not\to S) \in C$ then $M(C) \models (R \not\to S)$.

2. If $(R \to S) \in C$ then $M(C) \models (R \to S)$.

3. If $C$ is a complete consistency property then
$M(C) \models (R \not\to S)$ implies $(R \not\to S) \in C$.

4. If $C$ is a complete consistency property then
$M(C) \models (R \to S)$ implies $(R \to S) \in C$.

Proof: 

In this construction, we assume that the domain of every attribute can take at least countably
many values. First, we construct the model $M(C)$ as follows.

Construction:

Let $U$ be the set of all attributes. We construct the model $M(C)$ in stages $i$, each producing the $i$th
segment $M_i(C)$ of $M(C)$. Each $M_i(C)$ consists of two rows of a table (model), as follows.

1. Suppose $A = \bigcup \{B \subseteq U : (\emptyset \to B) \in C\}$. For each attribute $A_j \in A$, let $a_j$ be an attribute
value valid in its domain.

2. For each attribute $S \notin A$ for which there is some set of attributes $R$ satisfying
the condition $(R \not\to S) \in C$, let $\mathcal{S}$ be the set of all such maximal attribute sets $R$. Formally, $\mathcal{S}$
can be defined to satisfy the following properties.

• Any $R' \in \mathcal{S}$ satisfies $(R' \not\to S) \in C$.

• If any attribute set $R''$ satisfies $(R'' \not\to S) \in C$ then there is some attribute set $R' \in \mathcal{S}$
satisfying $R' \supseteq R''$.

• If $R' \in \mathcal{S}$, then $R'' \notin \mathcal{S}$ for any proper subset $R''$ of $R'$.

Let $W$ be the union of the sets $\mathcal{S}$ over all $S \in U \setminus A$, and let $\{W_i : 1 \le i\}$ be a listing of the elements of $W$.

3. Let $\overline{W_i} = \{S : (W_i \not\to S) \in C\}$. Now we construct $M_i(C)$, consisting of two rows (say row 0 and
row 1), by filling in the attributes as follows.

(a) For every attribute that appears in $A$, say $A_j$, fill in its value with $a_j$.

(b) For each $S \in \overline{W_i}$, fill the value of $S$ in rows 0 and 1 with $s_{i,0}$ and $s_{i,1}$, where they satisfy:

(1) Both $s_{i,0}$ and $s_{i,1}$ are valid for their domains.

(2) $s_{i,0} \ne s_{i,1}$.

(3) They do not appear in any table segments created so far.

(c) Let $W_i^{+} = \bigcup \{V \subseteq U : (W_i \to V) \in C\}$. Fill all the corresponding attribute values of
$W_i^{+}$ in both rows 0 and 1 with the same set of values, none of which have appeared in any
other table segment created so far.

(d) Fill the other (thus far unfilled; i.e. $U \setminus W_i^{+} \setminus \overline{W_i} \setminus A$) attribute values in both rows 0 and
1 with two sets of values that satisfy:

• None of them have appeared in any other table segment created so far.

• None of the corresponding component values in the two rows are equal.

These choices are possible because of the assumption that the domain of every attribute
in $U$ has at least countably many values.

4. Notice that except for the attribute values filled in for $W_i^{+}$ and $A$ in rows 0 and 1 of the same
table segment $M_i(C)$, none of the other attribute values are equal.
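
The two-row segments can be illustrated with a simplified Python sketch. It is not the construction above verbatim: it assumes that $C$ is the complete consistency property whose dependencies are exactly those derivable from a finite list of dependencies `fds` (decided by attribute closure), it builds one segment for every attribute subset rather than only for the maximal sets $W_i$, and it uses integers as attribute values. All function names are hypothetical.

```python
from itertools import combinations, count


def closure(attrs, fds):
    """Attribute closure of `attrs` under the (lhs, rhs) pairs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def subsets(universe):
    """All subsets of the attribute universe, as frozensets."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def satisfies_fd(rows, lhs, rhs):
    """Rows that agree on `lhs` also agree on `rhs`."""
    return all(all(r1[a] == r2[a] for a in rhs)
               for r1, r2 in combinations(rows, 2)
               if all(r1[a] == r2[a] for a in lhs))


def build_model(universe, fds):
    """Two rows per attribute subset W, agreeing exactly on the closure of W."""
    fresh = count(1)               # supplies values never used before
    constant = closure((), fds)    # plays the role of the set A
    rows = []
    for w in subsets(universe):
        w_plus = closure(w, fds)
        row0, row1 = {}, {}
        for attr in sorted(universe):
            if attr in constant:
                row0[attr] = row1[attr] = 0           # same value in every row
            elif attr in w_plus:
                row0[attr] = row1[attr] = next(fresh)  # equal within the segment
            else:
                row0[attr], row1[attr] = next(fresh), next(fresh)
        rows += [row0, row1]
    return rows
```

Under these assumptions the resulting table satisfies a dependency $(X \to Y)$ exactly when $Y$ lies in the closure of $X$, which is the behaviour that Theorem 7 asserts for a complete consistency property.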

We  show  that  our  construction  satisfies  the  required  properties  in  the  following  lemma. 

Lemma 20 $M(C)$ constructed in Theorem 7 satisfies the following properties.

1. If $(R \not\to S) \in C$ then $M(C) \models (R \not\to S)$.

2. If $(R \to S) \in C$ then $M(C) \models (R \to S)$.

3. If $C$ is a complete consistency property, then
$M(C) \models (R \not\to S)$ implies $(R \not\to S) \in C$.

4. If $C$ is a complete consistency property, then
$M(C) \models (R \to S)$ implies $(R \to S) \in C$.

Proof:

To show (1):

Suppose $(R \not\to S) \in C$. Then by definition of $C$, $(R \not\to S_i) \in C$ for some singleton subset $S_i$ of $S$.
Hence, by construction of $M(C)$, there is a maximal $W_k$ such that $R \subseteq W_k$ and $(W_k \not\to S_i) \in C$.
Consequently, in $M_k(C)$, the attribute values of $W_k$ in rows 0 and 1 are the same and the
attribute values of $S_i$ are distinct. Hence $M(C) \models (W_k \not\to S_i)$. Hence, $M(C) \models (R \not\to S_i)$. Therefore,
$M(C) \models (R \not\to S)$.

To show (2):

Suppose $(R \to S) \in C$. Then for all singleton subsets $S_i$ of $S$, $(R \to S_i) \in C$,
because $C$ is closed under deduction and $(R \to S) \vdash_{Armstrong} (R \to S_i)$. We show that $M(C) \models
(R \to S_i)$.

Notice that there is no $P \supseteq R$ with $(P \not\to S_i) \in C$. For if not, then $(P \to R) \in C$ (because
$P \supseteq R \vdash (P \to R)$) and hence by FI1, $(R \not\to S_i) \in C$ (because $C$ is closed under deduction),
contradicting $(R \to S_i) \in C$. Hence, $R \not\subseteq W_k$ for any $W_k$ used in the construction of $M_k(C)$.
Furthermore, the attribute values were chosen so that, except for the attributes from $A$, no two
rows across distinct segments have the same attribute values. Now, if $R \subseteq A$, then $(A \to R) \in C$
and hence $(\emptyset \to S_i) \in C$, and hence $S_i$ has the same attribute value across all rows, satisfying
$M(C) \models (R \to S_i)$. Conversely, if $R \not\subseteq A$, then there is an attribute of $R$ that is not in $A$. Two
distinct rows in $M(C)$ have the same values for the attributes in $R$
only when $R \subseteq W_k^{+}$. In this case $(W_k \to R) \in C$ and hence $(W_k \to S_i) \in C$, which implies $M_k(C) \models (R \to S_i)$.
Consequently, $M(C) \models (R \to S_i)$. Because $M(C) \models (R \to S_i)$ for every $i$, we get that $M(C) \models (R \to S)$.

To show (3):

Suppose $M(C) \models (R \not\to S)$. Then there is a singleton subset $S_k$ of $S$ satisfying $M(C) \models (R \not\to S_k)$.
Then there are two rows in $M(C)$ that have the same value for the attributes in $R$ and different values
for the attributes of $S_k$. By construction, except for attributes from $A$, only pairs of rows from the
same segment of $M(C)$ have equal value vectors.

Now suppose $R \subseteq A$. Then $(\emptyset \to R) \in C$, and thus $S_k \not\subseteq A$, for if not, then $S_k \subseteq A$, and hence,
by construction all rows of $M(C)$ have the same value for $S_k$. Therefore, $(A \to S_k) \notin C$, for, if
not, then $(A \to S_k) \in C$, and hence $(\emptyset \to S_k) \in C$, and hence by the definition of $A$, $S_k \subseteq A$.
Because $C$ is a complete consistency property, $(A \not\to S_k) \in C$. By applying FI2 to $(A \not\to S_k) \in C$
and $(A \to R) \in C$, we get $(R \not\to S_k) \in C$. By the deductive closure of $C$, we get that $(R \not\to S) \in C$.

Now suppose $R \not\subseteq A$. Hence, both rows that have equal value vectors for the attributes in $R$ must
come from the same segment, say $M_l(C)$. Then $R \subseteq W_l^{+}$. To show that $(W_l^{+} \not\to S_k) \in C$, notice that
$S_k$ having distinct values in rows 0 and 1 of $M_l(C)$ implies that $S_k \notin W_l^{+}$, and hence $(W_l^{+} \to S_k) \notin C$.
Consequently, because $C$ is a complete consistency property, $(W_l^{+} \not\to S_k) \in C$. But $(W_l^{+} \not\to S_k) \vdash (R \not\to S_k)$
and $(W_l^{+} \not\to S_k) \vdash (R \not\to S)$. Because $C$ is closed under deduction, we get $(R \not\to S) \in C$.

To show (4):

Suppose $M(C) \models (R \to S)$ and $(R \to S) \notin C$. Because $C$ is a complete consistency property,
$(R \not\to S) \in C$. By Part (1), $M(C) \models (R \not\to S)$, contradicting the assumption $M(C) \models (R \to S)$. ■


7.2  Constructing  Consistency  Properties 

In  this  section,  we  show  how  to  produce  a  consistency  property  from  an  underivable  sentence. 

Theorem 8 Suppose $\Sigma$ is a set of functional dependencies, $\Sigma'$ is a set of functional
independencies, and $\Sigma \cup \Sigma' \nvdash \psi$. If $\Sigma \cup \Sigma'$ is consistent, then there is a complete consistency property $C$
satisfying $\Sigma \cup \Sigma' \subseteq C$ and $\psi \notin C$.

Proof:

Let $L = \{(P_i, Q_i) : 0 \le i\}$ be a list of all pairs of subsets of $U$ such that $P_0$ is $X$ and $Q_0$ is $Y$, where
$\psi$ is either $(X \to Y)$ or $(X \not\to Y)$. By stages $\{i < \omega\}$ construct the consistency property $C$ as follows:

At Stage 0:

1. If $\psi$ is the dependency $(X \to Y)$, then by Lemma 17, $\Sigma \cup \Sigma' \cup \{X \not\to Y\}$ is consistent. Then
define $C(0) = Cn(\Sigma \cup \Sigma' \cup \{X \not\to Y\})$.

2. If $\psi$ is the independency $(X \not\to Y)$, then by Lemma 16, $\Sigma \cup \Sigma' \cup \{X \to Y\}$ is consistent. Then
define $C(0) = Cn(\Sigma \cup \Sigma' \cup \{X \to Y\})$.

Notice that in both cases, $C(0)$ is consistent.

At Stage $i + 1 > 0$:

1. If $C(i) \vdash (P_i \to Q_i)$, define $C(i+1)$ as $Cn(C(i))$.

2. If $C(i) \nvdash (P_i \to Q_i)$, by Lemma 17, $C(i) \cup \{P_i \not\to Q_i\}$ is consistent. Hence define $C(i+1)$ as
$Cn(C(i) \cup \{P_i \not\to Q_i\})$.
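
For a finite attribute universe, the staged construction can be sketched in Python as follows. The sketch rests on assumptions that are not established in this section: derivability of a dependency from a consistent $C(i)$ is decided by attribute closure over the dependencies it contains, attributes are single characters, and $\Sigma'$ is not passed explicitly because consistency is assumed to guarantee that its members end up labelled as independencies. The function `complete` and its label strings are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under the (lhs, rhs) pairs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def subsets(universe):
    """All subsets of the attribute universe, as frozensets."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def complete(universe, fds, psi):
    """Label every pair (P, Q) as a dependency ("fd") or independency ("fi").

    `psi` is (kind, X, Y) with kind "fd" or "fi"; it is assumed underivable.
    """
    kind, x, y = psi
    deps = list(fds)
    if kind == "fi":
        deps.append((x, y))  # stage 0, case 2: add X -> Y
    # Stage 0, case 1 (psi is X -> Y): X -/-> Y is added by the loop below,
    # because X -> Y is assumed not to be derivable.
    labels = {}
    for p in subsets(universe):
        for q in subsets(universe):
            derivable = q <= closure(p, deps)   # stage i+1, case 1 or 2
            labels[(p, q)] = "fd" if derivable else "fi"
    return labels
```

Every pair receives exactly one label, so the result is complete by construction, and the pair corresponding to $\psi$ receives the opposite label from $\psi$ itself.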

Notice that at every stage $i$, the following hold.

• If $C(i)$ is consistent, then $C(i+1)$ is consistent. Therefore, at every stage $i$, we have that $C(i)$
is consistent.

• $(P_i \to Q_i) \in C(i+1)$ or $(P_i \not\to Q_i) \in C(i+1)$.

Let $C = \bigcup_{0 \le i} C(i)$. Then $C$ is a consistency property. This is true because of the following facts.

1. $C$ is non-contradictory.

There are no attribute sets $R$ and $S$ satisfying $(R \not\to S), (R \to S) \in C$ because $C(i)$ satisfies
that property for each $i$, due to the consistency of $C(i)$, and $C(i+1) \supseteq C(i)$ for all $i \ge 0$. The
construction at stage $i + 1$ ensures that either $(P_i \to Q_i) \in C(i+1)$ or $(P_i \not\to Q_i) \in C(i+1)$ for all
$i \ge 0$.

2. $C$ is closed under deduction.

This is because of the finitary nature of the proof rules. For suppose $C \vdash \theta$; then, because any
proof of $\theta$ uses finitely many assumptions from $C$, there is a stage $i$ where $C(i) \vdash \theta$. Hence
$\theta \in C(i+1)$, as $C(i+1) \supseteq Cn(C(i))$, implying $\theta \in C$.

Finally,  ip  &  C  because  of  the  following  reasons. 

•  If  ip  is  a  dependency  {X  — >•  V),  then  {X  •/*  Y)  £  C,  and  because  of  the  non-contradictory 
nature  of  C,  {X  ->  Y)  g  C. 

•  If  ip  is  a  independency  {X  -ft  Y),  then  (A  — >•  Y)  G  C,  and  because  of  the  non-contradictory 
nature  of  C,  {X  ■/*  Y)  C. 

7.3  Proving the Completeness Theorem

In this section, we prove the completeness theorem.

Theorem 9 (Completeness of the Proof System) Our proof rules are complete for functional
dependencies and independencies; i.e., if M ⊨ ψ whenever M ⊨ Σ ∪ Σ′, then Σ ∪ Σ′ ⊢ ψ. Here Σ is
a set of functional dependencies and Σ′ is a set of functional independencies.

Proof:

Suppose not. Then there are sets Σ, Σ′ and a sentence ψ such that M ⊨ ψ whenever M ⊨ Σ ∪ Σ′,
yet Σ ∪ Σ′ ⊬ ψ. We produce a model M satisfying M ⊨ Σ ∪ Σ′ but M ⊭ ψ, which is a contradiction.

1. Since Σ ∪ Σ′ ⊬ ψ, by Theorem 8 there is a complete consistency property C satisfying
Σ ∪ Σ′ ⊆ C and ψ ∉ C.

2. By Theorem 7, there is a model M(C) satisfying M(C) ⊨ Σ ∪ Σ′ and M(C) ⊭ ψ. This is
true because M(C) ⊨ π if and only if π ∈ C for any dependency or independency π.
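
The satisfaction relation M ⊨ π used in this proof can be made concrete for a relation given as a list of rows. The sketch below assumes the usual semantics: a dependency X → Y holds when rows agreeing on X also agree on Y, and an independency X ↛ Y holds exactly when that dependency fails. The function names are hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in sorted(lhs))
        val = tuple(row[a] for a in sorted(rhs))
        # The first row with this key fixes the expected rhs values.
        if seen.setdefault(key, val) != val:
            return False
    return True

def satisfies_fi(rows, lhs, rhs):
    """Assumed semantics: the independency holds when the dependency is violated."""
    return not satisfies_fd(rows, lhs, rhs)
```

Under this reading every relation satisfies exactly one of X → Y and X ↛ Y, which is why a complete consistency property determines the truth of every sentence in its model.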


8  Armstrong’s  Relations 

In this section, we show the connection between Armstrong’s relations [BDFS84] and our proof of
completeness. For a given set of functional dependencies, Armstrong’s relations are relations that
satisfy all those and only those functional dependencies that are logical consequences of the given
set.


8.1  The  Classical  Case:  Another  Construction 

Given a set of functional dependencies Σ, [BDFS84] shows how to produce an Armstrong relation.
Their construction is as follows.

• Let A be defined as ∪{B ⊆ U : (∅ → B) ∈ C}.

• For each dependency ψ where Σ ⊬ ψ, construct a model M_ψ with two rows that satisfies Σ but
not ψ, with the property that each attribute in A takes the same value across the models
for all such Σ ⊬ ψ.

• Let M = ⊎ M_ψ, where ⊎ is the disjoint union; i.e., M is constructed by taking all rows
of all M_ψ's.
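
A small executable sketch in the spirit of this construction follows. It is an illustration under assumptions, not the exact procedure of [BDFS84]: it builds one two-row model per attribute set X (rather than per non-derivable dependency), lets the two rows agree exactly on the closure of X, and gives the attributes in the closure of the empty set one shared constant value. All function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a collection of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_fd(rows, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in sorted(lhs))
        val = tuple(row[a] for a in sorted(rhs))
        if seen.setdefault(key, val) != val:
            return False
    return True

def armstrong_relation(universe, fds):
    """Disjoint union of two-row models, one pair per attribute set X."""
    attrs = sorted(universe)
    constants = closure(frozenset(), fds)   # attributes fixed across all models
    rows = []
    for pair_id, x in enumerate(subsets(universe)):
        agree = closure(x, fds)             # the pair agrees exactly on X's closure
        for copy in (0, 1):
            row = {}
            for a in attrs:
                if a in constants:
                    row[a] = "const"
                elif a in agree:
                    row[a] = (pair_id, "same")
                else:
                    row[a] = (pair_id, copy)
            rows.append(row)
    return rows
```

The resulting relation has two rows for every attribute set, so it is far from minimal; it only demonstrates that satisfaction in the union coincides with derivability.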

For a given consistent set of functional dependencies Σ, we can create an Armstrong relation
by using our construction as follows. Suppose {X_i → Y_i : i ≥ 1} = {ψ : Σ ⊬ ψ}. Then, we show
that Σ ∪ {X_i ↛ Y_i : i ≥ 1} is consistent. Suppose Σ_n = Σ ∪ {X_i ↛ Y_i : i < n}. Due to Lemma
1, Σ_n ⊬_Armstrong (X_n → Y_n). Hence, by Lemma 17, Σ_n ∪ {X_n ↛ Y_n} is consistent. Therefore, by
induction, Σ ∪ {X_i ↛ Y_i : i ≥ 1} is consistent and in fact it is a complete consistency condition.
Therefore by Theorem 9, Σ ∪ {X_i ↛ Y_i : i ≥ 1} has a model, say M. Then M ⊨ ψ if and only if
Σ ⊢ ψ. Hence M is an Armstrong relation. A careful examination of the construction of Theorem
20 shows that M has the potential of being a model with a smaller number of rows than the construction
given in [BDFS84].

8.2  Armstrong Relations in the Presence of Independencies

In light of known results and the importance of Armstrong relations, a natural question that arises
is: given a set of functional dependencies and independencies, is there a relation that satisfies all
those and only those that are logical consequences of the given set? The answer to this question
is that in general there is no relation that can satisfy the above requirements, as shown in
the following example. Suppose a relation schema has attributes A, B, C, D and E, and let the
set of dependencies and independencies be Σ ∪ Σ′ = {(A → B), (B ↛ C)}, where Σ is the set of
dependencies and Σ′ is the set of independencies. Then {(A → B), (B ↛ C)} ⊬ (D → E), (D ↛ E).
Every relation satisfies exactly one of (D → E) and (D ↛ E), so no relation can satisfy only the
consequences of Σ ∪ Σ′. Notice also that there is no relation that satisfies both (D → E) and
(D ↛ E), because Σ ∪ Σ′ ∪ {(D → E), (D ↛ E)} is inconsistent in our proof system.

In general, the following is possible. Suppose Σ and Σ′ are respectively a set of dependencies and
a set of independencies where Σ ∪ Σ′ is consistent. Then any consistent complete extension Σ″ of Σ ∪ Σ′
has a model, where complete means that for any sets of attributes X, Y either (X → Y) ∈ Σ″ or
(X ↛ Y) ∈ Σ″. This is derivable from our theorems.

9  Use  of  Inference  Rules 

In [Bel95a] and [Bel95b] it is shown how to use proof rules in the inference of functional dependencies
from data values. In this work, a Prolog-based inference engine is interleaved with the mining
engine. The inference engine adds newly mined dependencies and independencies to an existing
known set. When the mining engine is used, it omits mining for facts that have already been
derived, or rejected on the basis of derived independencies.

In [GJS+96] it is shown that for probabilistic functional dependencies, the performance of a
mining algorithm can be enhanced by using inference rules to reject and accept already derived
dependencies.

In other general data mining work, such as [AIS93] and [AS95], proof rules are not explicitly
used, but some facts are automatically accepted or rejected based on the properties of the
dependencies that are being mined for. Since all that proof rules do is generate new facts from already
known facts, such use of properties can be considered as using an inference engine to some
extent. What a complete proof procedure adds to such a process is a complete
set of properties that can be used in such circumstances.
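
As an illustration of this kind of pruning, the sketch below classifies a candidate dependency before any data is scanned. It is not the Prolog engine of [Bel95a]; it is a hypothetical helper that assumes dependencies can be decided by attribute closure, and that a candidate is refuted when adding it to the known dependencies would derive a dependency contradicting a known independency.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a collection of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def classify(candidate, known_fds, known_fis):
    """Return 'derived', 'refuted' or 'unknown' for a candidate dependency (lhs, rhs)."""
    lhs, rhs = candidate
    if rhs <= closure(lhs, known_fds):
        return "derived"              # already a consequence: no need to mine
    extended = list(known_fds) + [candidate]
    for p, q in known_fis:
        if q <= closure(p, extended):
            return "refuted"          # would contradict a known independency
    return "unknown"                  # only these candidates need the data
```

For example, with B → C and A ↛ C known, the candidate A → B is refuted without looking at the data, since it would yield A → C.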

10  Conclusions 

In this paper, we have presented a sound and complete axiomatization of functional independencies.
We have also outlined the proof of completeness of this system using a syntactic method.
One of the consequences of the completeness proof is that a straightforward method for generating
the Armstrong relation for a given set of dependencies is obtained. The second advantage of this
axiomatization is that every proof in this system has a normal form with
at most three levels of application of FI-rules, and this can be used to search for proofs
efficiently. Consequently, as shown by Bell [Bel95a], these rules can be profitably used to mine
for functional dependencies. Mining of FDs can prove useful in various situations such as semantic
query optimization, database design, and database restructuring. Other applications [PK95] where
the search space can be pruned using both positive and negative knowledge can also take advantage
of this axiomatization. In the same vein, other data mining application domains, such as association
rules and sequential patterns, can also take advantage of negative knowledge of relationships as
well as positive knowledge. Search mechanisms will only be reinforced with such capabilities.

Our goal in this direction is to produce a general mining engine which utilizes both positive
and negative knowledge as discussed above. Functional dependencies present themselves as a prime
candidate for this application due to their highly structured nature and well-known properties. For
constructing a mining system for FDs, we need highly efficient and mechanizable proof
methods. Such a proof method using tableaux is presented in [WGSN97].

A  Appendix

Proof  of  the  Incompleteness  of  Janas’  System  (Lemma  4) 

(By  Contradiction) 

Consider the case where X, Y, and Z each consist of one attribute and they are all different from
each other. Assume there is a proof t of (Z ↛ X) from {X → Y, Z ↛ Y}.

The plan of our proof is as follows. By induction on the number of applications of J3 in the
independency thread, we show that there is a proof of (Z ↛ X) from {X → Y, Z ↛ Y} that does
not use J3. Next we show that it is impossible to prove (Z ↛ X) from {X → Y, Z ↛ Y} only by
using J1 and J2.

Suppose t is a proof of Z ↛ X that uses J3. Then the first application of J3 in the independency
thread must be of the following form:

    Z → T    Z ↛ P
    --------------- J3
        T ↛ P

Hence, by part (3) of Lemma 3, {X → Y} ⊢_fd (Z → T). This is impossible by the
completeness of Armstrong’s rules and the existence of counter models for {X → Y} ⊬_fd (Z → T),
except when T ⊆ Z, or Z ⊇ X and T ⊇ Y. But notice that Z ⊇ X is impossible by our choices of
attributes. In this case, the application of J3 is superfluous. By induction, we can argue that all
subsequent applications of J3 are superfluous. Hence there is a proof of F2 in Janas’ system that
does not use J3.

Now, to show that this is impossible, suppose t does not have an application of J3. By part (1)
of Lemma ??, t has a single independency thread. In order to apply J2 non-trivially, the left hand
side of the independency must have more than one attribute. But, in our application we start with
a single attribute set X and it does not change if the only rules applied are J1 and J2. Hence the
rule J2 cannot be applied to our situation. The only rule applicable is J1. But then X ⊇ Y, which
is a contradiction because X and Y are distinct single attribute sets. ■

Proof  of  the  Merging  Lemma  (Lemma  5): 

Case 1:

To show the merging of Rule FI1, suppose a proof segment of t is as follows:

                        t′
    V → U_2    V ↛ Y U_1 U_2
    -------------------------- FI1
    V → U_1    V ↛ Y U_1
    -------------------------- FI1
             V ↛ Y
              t″

This proof fragment is equivalent to the following:

                          t′
    V → U_1 U_2    V ↛ Y U_1 U_2
    ------------------------------ FI1
               V ↛ Y
                t″


Case 2:

To show the merging of FI2, suppose there is a fragment of t of the following form.

                    t_1        t
       t_2        X → Y_1    X ↛ Z
                  ------------------ FI2
    Y_1 → Y_2         Y_1 ↛ Z
    ---------------------------- FI2
             Y_2 ↛ Z
               t_3

Then it is equivalent to the following fragment:

      t_1         t_2
    X → Y_1    Y_1 → Y_2        t
    ---------------------
          X → Y_2             X ↛ Z
          -------------------------- FI2
                  Y_2 ↛ Z
                    t_3

Case 3:

To show the merging of FI3, suppose there is a proof fragment with two successive applications of
FI3 of the following form:

                   t_1       t_2
       t_3        Y → Z     X ↛ Z
                  ---------------- FI3
    U_1 → Y           X ↛ Y
    -------------------------- FI3
             X ↛ U_1
               t_4

It is equivalent to the following proof fragment with a single application of FI3:

      t_3        t_1
    U_1 → Y     Y → Z         t_2
    ------------------
         U_1 → Z             X ↛ Z
         -------------------------- FI3
                 X ↛ U_1
                   t_4
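
The three merging cases can be replayed mechanically on attribute sets. The sketch below encodes the rule shapes as they can be read off the fragments of this lemma (FI1: V → U, V ↛ YU give V ↛ Y; FI2: X → Y, X ↛ Z give Y ↛ Z; FI3: Y → Z, X ↛ Z give X ↛ Y); the function names and the pair representation are assumptions made for illustration.

```python
F = frozenset

def fi1(fd, fi):
    """FI1: from V -> U and V -/-> YU conclude V -/-> Y (the part of YU outside U)."""
    (v, u), (v2, yu) = fd, fi
    assert v == v2 and u <= yu
    return (v, yu - u)

def fi2(fd, fi):
    """FI2: from X -> Y and X -/-> Z conclude Y -/-> Z."""
    (x, y), (x2, z) = fd, fi
    assert x == x2
    return (y, z)

def fi3(fd, fi):
    """FI3: from Y -> Z and X -/-> Z conclude X -/-> Y."""
    (y, z), (x, z2) = fd, fi
    assert z == z2
    return (x, y)

def trans(fd1, fd2):
    """Transitivity of dependencies: X -> Y and Y -> Z give X -> Z."""
    (x, y), (y2, z) = fd1, fd2
    assert y == y2
    return (x, z)

V, Y, U1, U2 = F("V"), F("Y"), F("P"), F("Q")
# Case 1: two FI1 steps versus one FI1 step with the united right-hand side.
two_steps = fi1((V, U1), fi1((V, U2), (V, Y | U1 | U2)))
one_step = fi1((V, U1 | U2), (V, Y | U1 | U2))

X, Y1, Y2, Z = F("X"), F("A"), F("B"), F("Z")
# Case 2: two FI2 steps versus transitivity followed by one FI2 step.
case2_two = fi2((Y1, Y2), fi2((X, Y1), (X, Z)))
case2_one = fi2(trans((X, Y1), (Y1, Y2)), (X, Z))

# Case 3: two FI3 steps versus transitivity followed by one FI3 step.
case3_two = fi3((U1, Y), fi3((Y, Z), (X, Z)))
case3_one = fi3(trans((U1, Y), (Y, Z)), (X, Z))
```

In each case the pair of successive applications and the single merged application reach the same independency.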


Proof  of  the  Interchangeability  Lemma  (Lemma  6): 

Case 1: (Interchangeability of FI2 and FI3)

Suppose there is a proof fragment of the following form, where FI3 is applied following an application
of FI2:

                  t_1       t_2
      t_3        X → Y     X ↛ Z
                 ---------------- FI2
    W → Z            Y ↛ Z
    ------------------------ FI3
            Y ↛ W
             t_4

This proof fragment is equivalent to the following proof fragment.

                  t_3       t_2
      t_1        W → Z     X ↛ Z
                 ---------------- FI3
    X → Y            X ↛ W
    ------------------------ FI2
            Y ↛ W
             t_4

Conversely, a proof fragment of the latter form is equivalent to a proof fragment of the earlier
form.

Case 2: (Partial Interchangeability of FI1 and FI2)

Suppose there is a proof fragment of the following form, where FI1 is applied following an
application of FI2:

                  t_2       t_3
      t_1        W → V     W ↛ YX
                 ----------------- FI2
    V → X            V ↛ YX
    ------------------------- FI1
            V ↛ Y
             t_4

The above proof fragment is equivalent to the following proof fragment, in which the order of application
of FI1 and FI2 is reversed.

      t_2       t_1
    W → V     V → X          t_3
    ----------------
         W → X             W ↛ YX           t_2
         ------------------------- FI1
                 W ↛ Y                     W → V
                 -------------------------------- FI2
                            V ↛ Y
                             t_4


Case 3: (Partial Interchangeability of FI1 and FI3)

Suppose there is a proof fragment of the following form, where FI3 is applied following an
application of FI1:

                  t_1       t_2
      t_3        V → X     V ↛ YX
                 ----------------- FI1
    W → Y            V ↛ Y
    ------------------------ FI3
            V ↛ W
             t_4

The above proof is equivalent to the following proof fragment, in which the order of application of
FI1 and FI3 is reversed.

      t_1          t_3
     V → X        W → Y
    ---------    ---------
    VW → XW      XW → YX         t_2
    ---------------------
          VW → YX              V ↛ YX
          ---------------------------- FI3
    V → V            V ↛ VW
    ------------------------- FI1
             V ↛ W
              t_4


Proof of the Normal Block Merging Lemma (Lemma 9):

In the degenerate cases, reduction of all possible combinations of two normal blocks to a single
normal block is shown below. The blocks are assumed to be in the order a followed by b. Due to the
large number of cases we use the following notation. The application of functional independency
rules in each block is denoted by the block name subscripted by the rule number; for example
a3, a1, a2 means that in the independency thread of a, the independency rules are applied in
the order FI3, FI1 and FI2. In this notation, we can denote all possible types that the normal blocks a and
b can have, where they may or may not contain applications of all the independency
rules in their independency threads. We denote these cases by the digital equivalent of the binary
pattern, reading each three-bit group as one digit, where a 1 denotes the application of a rule and
a 0 denotes its absence in the normal block.
For example, Pattern 56, corresponding to the binary pattern 101 110, stands for the case where
block a has FI3, does not have FI1, and has FI2, and block b has FI3 and FI1, but no FI2.
We show the application of Lemmas 5 and 6 through the stages in the reduction of the two normal
blocks into an equivalent normal block.
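
The bookkeeping behind these patterns can be sketched in a few lines. The code below is an illustration under assumptions, not part of the proof: it decodes a pattern number into the rule applications of blocks a and b, swaps the leftmost out-of-order adjacent pair one step at a time (standing in for an application of Lemma 6), and finally merges adjacent applications of the same rule (standing in for Lemma 5). The names `decode` and `reduce_pattern` are hypothetical.

```python
ORDER = (3, 1, 2)   # rule order inside a normal block: FI3, FI1, FI2

def decode(digit, block):
    """Rule applications of one block, e.g. decode(5, 'a') == ['a3', 'a2'] for bits 101."""
    bits = format(digit, "03b")
    return [f"{block}{rule}" for rule, bit in zip(ORDER, bits) if bit == "1"]

def reduce_pattern(pattern):
    """Stages (label, sequence) reducing blocks a and b to one normal block."""
    seq = decode(pattern // 10, "a") + decode(pattern % 10, "b")
    stages = [("start", list(seq))]
    rank = {"3": 0, "1": 1, "2": 2}
    while True:
        for i in range(len(seq) - 1):
            if rank[seq[i][1]] > rank[seq[i + 1][1]]:
                # Interchange two adjacent applications (role of Lemma 6).
                seq[i], seq[i + 1] = seq[i + 1], seq[i]
                stages.append(("Lemma 6", list(seq)))
                break
        else:
            break   # no adjacent pair is out of order any more
    merged = []
    for app in seq:
        if merged and merged[-1][1] == app[1]:
            merged[-1] = "c" + app[1]   # merge two applications of one rule (Lemma 5)
        else:
            merged.append(app)
    if merged != seq:
        stages.append(("Lemma 5", merged))
    return stages
```

For Pattern 56 this yields a3 a2 b3 b1, then a3 b3 a2 b1, then a3 b3 b1 a2, and finally c3 b1 a2.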

Pattern 77:

a3a1a2b3b1b2 ↦[Lemma 6] a3a1b3a2b1b2 ↦[Lemma 6] a3b3a1a2b1b2 ↦[Lemma 6] a3b3a1b1a2b2 ↦[Lemma 5] c3c1c2

Pattern 76:

a3a1a2b3b1 ↦[Lemma 6] a3a1b3a2b1 ↦[Lemma 6] a3b3a1a2b1 ↦[Lemma 6] a3b3a1b1a2 ↦[Lemma 5] c3c1a2

Pattern 75:

a3a1a2b3b2 ↦[Lemma 6] a3a1b3a2b2 ↦[Lemma 6] a3b3a1a2b2 ↦[Lemma 5] c3a1c2

Pattern 74:

a3a1a2b3 ↦[Lemma 6] a3a1b3a2 ↦[Lemma 6] a3b3a1a2 ↦[Lemma 5] c3a1a2

Pattern 73:

a3a1a2b1b2 ↦[Lemma 6] a3a1b1a2b2 ↦[Lemma 5] a3c1c2

Pattern 72:

a3a1a2b1 ↦[Lemma 6] a3a1b1a2 ↦[Lemma 5] a3c1a2

Pattern 71:

a3a1a2b2 ↦[Lemma 5] a3a1c2

Pattern 67:

a3a1b3b1b2 ↦[Lemma 6] a3b3a1b1b2 ↦[Lemma 5] c3c1b2

Pattern 66:

a3a1b3b1 ↦[Lemma 6] a3b3a1b1 ↦[Lemma 5] c3c1

Pattern 65:

a3a1b3b2 ↦[Lemma 6] a3b3a1b2 ↦[Lemma 5] c3a1b2

Pattern 64:

a3a1b3 ↦[Lemma 6] a3b3a1 ↦[Lemma 5] c3a1

Pattern 63:

a3a1b1b2 ↦[Lemma 5] a3c1b2

Pattern 62:

a3a1b1 ↦[Lemma 5] a3c1

Pattern 61:

a3a1b2

Pattern 57:

a3a2b3b1b2 ↦[Lemma 6] a3b3a2b1b2 ↦[Lemma 6] a3b3b1a2b2 ↦[Lemma 5] c3b1c2

Pattern 56:

a3a2b3b1 ↦[Lemma 6] a3b3a2b1 ↦[Lemma 6] a3b3b1a2 ↦[Lemma 5] c3b1a2

Pattern 55:

a3a2b3b2 ↦[Lemma 6] a3b3a2b2 ↦[Lemma 5] c3c2

Pattern 54:

a3a2b3 ↦[Lemma 6] a3b3a2 ↦[Lemma 5] c3a2

Pattern 53:

a3a2b1b2 ↦[Lemma 6] a3b1a2b2 ↦[Lemma 5] a3b1c2

Pattern 52:

a3a2b1 ↦[Lemma 6] a3b1a2

Pattern 51:

a3a2b2 ↦[Lemma 5] a3c2

Pattern 47:

a3b3b1b2 ↦[Lemma 5] c3b1b2

Pattern 46:

a3b3b1 ↦[Lemma 5] c3b1

Pattern 45:

a3b3b2 ↦[Lemma 5] c3b2

Pattern 44:

a3b3 ↦[Lemma 5] c3

Pattern 43:

a3b1b2

Pattern 42:

a3b1

Pattern 41:

a3b2

Pattern 37:

a1a2b3b1b2 ↦[Lemma 6] a1b3a2b1b2 ↦[Lemma 6] b3a1a2b1b2 ↦[Lemma 6] b3a1b1a2b2 ↦[Lemma 5] b3c1c2

Pattern 36:

a1a2b3b1 ↦[Lemma 6] a1b3a2b1 ↦[Lemma 6] b3a1a2b1 ↦[Lemma 6] b3a1b1a2 ↦[Lemma 5] b3c1a2

Pattern 35:

a1a2b3b2 ↦[Lemma 6] a1b3a2b2 ↦[Lemma 6] b3a1a2b2 ↦[Lemma 5] b3a1c2

Pattern 34:

a1a2b3 ↦[Lemma 6] a1b3a2 ↦[Lemma 6] b3a1a2

Pattern 33:

a1a2b1b2 ↦[Lemma 6] a1b1a2b2 ↦[Lemma 5] c1c2

Pattern 32:

a1a2b1 ↦[Lemma 6] a1b1a2 ↦[Lemma 5] c1a2

Pattern 31:

a1a2b2 ↦[Lemma 5] a1c2

Pattern 27:

a1b3b1b2 ↦[Lemma 6] b3a1b1b2 ↦[Lemma 5] b3c1b2

Pattern 26:

a1b3b1 ↦[Lemma 6] b3a1b1 ↦[Lemma 5] b3c1

Pattern 25:

a1b3b2 ↦[Lemma 6] b3a1b2

Pattern 24:

a1b3 ↦[Lemma 6] b3a1

Pattern 23:

a1b1b2 ↦[Lemma 5] c1b2

Pattern 22:

a1b1 ↦[Lemma 5] c1

Pattern 21:

a1b2

Pattern 17:

a2b3b1b2 ↦[Lemma 6] b3a2b1b2 ↦[Lemma 6] b3b1a2b2 ↦[Lemma 5] b3b1c2

Pattern 16:

a2b3b1 ↦[Lemma 6] b3a2b1 ↦[Lemma 6] b3b1a2

Pattern 15:

a2b3b2 ↦[Lemma 6] b3a2b2 ↦[Lemma 5] b3c2

Pattern 14:

a2b3 ↦[Lemma 6] b3a2

Pattern 13:

a2b1b2 ↦[Lemma 6] b1a2b2 ↦[Lemma 5] b1c2

Pattern 12:

a2b1 ↦[Lemma 6] b1a2

Pattern 11:

a2b2 ↦[Lemma 5] c2

Proof  of  the  Merging  Lemma  for  Functional  Dependencies  (Lemma  10) 

We first show that two successive applications of FD2 can be reduced to a single application of
FD2. Suppose two successive applications of FD2 are as follows:

                     t′
    W ⊆ V          X → Y
    --------------------- FD2
    W_1 ⊆ V_1     XV → YW
    ---------------------- FD2
      XVV_1 → YWW_1
            t″

This can be replaced by the following proof fragment, which has only one application of FD2.

                        t′
    WW_1 ⊆ VV_1       X → Y
    ------------------------ FD2
       XVV_1 → YWW_1
            t″

Then, by induction, the general result follows. ■

Proof  of  the  Interchange  Lemma  for  Functional  Dependencies  (Lemma  11) 

Suppose the following proof fragment is an application of FD3 followed by FD2.

     t_1       t_2
    X → Y     Y → Z
    ---------------- FD3
    W ⊆ V     X → Z
    ---------------- FD2
       XV → ZW
         t″

Then it can be replaced by the following proof fragment, in which the proof rules appear in the
reverse order FD2, FD3.

               t_1                          t_2
    W ⊆ V     X → Y             W ⊆ W     Y → Z
    ----------------- FD2       ----------------- FD2
        XV → YW                     YW → WZ
        -------------------------------------- FD3
                      XV → WZ
                        t″


Proof  of  Structural  Property  of  Transitive  Envelopes  (Theorem  3): 

The proof is by induction on the structure of the proof tree t. Let T denote its transitive envelope.

For the base case, if the last proof rule applied is FD2, then T is null. If the last proof rule
is FD3, and the rules that are applied at levels immediately higher are not FD3, then t is of the
following form.

     t_1       t_2
    X → Y     Y → Z
    ---------------- FD3
        X → Z

Then T is (X → Y), (Y → Z), which satisfies Theorem 3.

Now, for the inductive case, assume that the proof trees above t_1 and t_2 have transitive envelopes
(A_1 → A_2), ..., (A_{n−1} → A_n) and (A_{n+1} → A_{n+2}), ..., (A_{m−1} → A_m). By the inductive
hypothesis, A_1 is X, A_n is Y, A_{n+1} is Y and A_m is Z. Hence (A_1 → A_2), ..., (A_{m−1} → A_m) is the transitive
envelope of t. ■

Proof  of  Repetition  Removal  from  Transitive  Envelopes  (Theorem  4): 

Suppose the chain (A_1 → A_2), ..., (A_n → A_{n+1}) is a transitive envelope of a proof t of
(A_1 → A_{n+1}) and (X → Y) appears in (A_1 → A_2), ..., (A_n → A_{n+1}) more than once. The aim is to get
a proof t′ of (A_1 → A_{n+1}) using the same set of assumptions (possibly a subset thereof) as that
of t, but without repeated occurrences of (X → Y) in the transitive envelope of t′. Also suppose
that t_i is the sub-proof tree that has (A_i → A_{i+1}) as its root (conclusion) in the proof tree t.

Suppose the first and last occurrences of (X → Y) in the transitive envelope (A_1 → A_2), ..., (A_n → A_{n+1})
are respectively (A_a → A_{a+1}) and (A_b → A_{b+1}).

Case 1: (a > 1 and b < n − 1)

Then, there is an FD proof t′ of (A_1 → A_{n+1}) that uses only FD3 as proof rule and
(A_1 → A_2), ..., (A_{a−1} → A_a), (A_{b+1} → A_{b+2}), ..., (A_n → A_{n+1}) as assumptions. Then the following is a
proof of (A_1 → A_{n+1}) from the same set of assumptions (possibly fewer).

       t_1      ...     t_{a−1}         t_a        t_{b+1}       ...       t_n
    A_1 → A_2   ...   A_{a−1} → X     X → Y     Y → A_{b+2}     ...   A_n → A_{n+1}
                                        ...
                                  A_1 → A_{n+1}

Case 2: (a = 1 and b < n)

The following is a proof of (A_1 → A_{n+1}) from the same set of assumptions (possibly fewer).

     t_a        t_{b+1}       ...        t_n
    X → Y     Y → A_{b+2}     ...   A_n → A_{n+1}
                      ...
                A_1 → A_{n+1}

Case 3: (a > 1 and b = n)

Then the following is a proof of (A_1 → A_{n+1}) from the same set of assumptions (possibly fewer).

     t_1     ...     t_a
     ...           X → Y
        A_1 → A_{n+1}

Case 4: (a = 1 and b = n)

Then (A_1 → A_{n+1}) is the proof of itself, as A_1 is X and A_{n+1} is Y. ■
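
The effect of this theorem on the envelope itself is a simple cut: everything between the first and the last occurrence of (X → Y) is dropped and one copy is kept. The sketch below illustrates only that list manipulation, under the assumption that a transitive envelope is represented as a list of (lhs, rhs) pairs; it does not rebuild the sub-proofs. The function names are hypothetical.

```python
def is_chain(chain):
    """True if each dependency's right-hand side is the next one's left-hand side."""
    return all(chain[i][1] == chain[i + 1][0] for i in range(len(chain) - 1))

def remove_repetition(chain, fd):
    """Keep one copy of fd by cutting from its first to its last occurrence."""
    positions = [i for i, link in enumerate(chain) if link == fd]
    if len(positions) < 2:
        return list(chain)            # nothing is repeated
    first, last = positions[0], positions[-1]
    return chain[:first] + [fd] + chain[last + 1:]

envelope = [("A", "X"), ("X", "Y"), ("Y", "B"), ("B", "X"), ("X", "Y"), ("Y", "C")]
shortened = remove_repetition(envelope, ("X", "Y"))
```

The shortened list is again a chain from the same first attribute set to the same last one, and it uses a subset of the original dependencies.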

Proof  of  Lemma  12  : 

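(1)

The following proof in semi-normal form suffices.

    Y ⊆ Y    Y → A        Y ⊆ Y    A → XV          W ⊆ V
    ---------------       ----------------       -----------
        Y → YA               YA → XVY             YW ⊆ XVY
        ------------------------------           -----------
                   Y → XVY                        XVY → YW
                   ----------------------------------------
                                  Y → YW

(2)

The following proof in semi-normal form suffices.

      W ⊆ V
    ----------
    YW ⊆ YV          Y ⊆ Y    YW → XA            B ⊆ A
    ----------       -----------------        -----------
    YV → YW              YW → YXA              YB ⊆ YXA
    ------------------------------            -----------
              YV → YXA                         YXA → YB
              -------------------------------------------
                              YV → YB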


Proof  of  Repetition  Removal  from  Assumptions  (Theorem  5): 

Suppose that t is a proof of (E → F) in FD and that the assumption (X → Y) is used more
than once in t. Let t′ be the semi-normal form of t. Notice that the set of assumptions used in t′ is
a subset of the set of assumptions used in t. Suppose that (X → Y) appears more than once in t′.

If all multiple uses of (X → Y) as assumptions occur in the transitive envelope of t′, then by
Theorem 4, there is a proof t″ that does not repeatedly use (X → Y) as an assumption. Hence, we
need to prove that such repeated occurrences can be eliminated only when not all of them occur in
the transitive envelope. To prove so, we consider pairs of such duplicates, where not both of them
occur in the transitive envelope, and reduce the proof so that the reduced proof contains only one
occurrence instead of two. In order to do so, let T′ and TE′ be the ordered listing of the
assumptions and the transitive envelope of t′ respectively.

Case 1: Suppose (X → Y) occurs successively in T′, once as an antecedent to FD2 and next
as an antecedent to FD3 (hence this occurrence is included in the transitive envelope) in that
order. Hence there is some subsequence (X → Y), (Y → A_1), (A_1 → A_2), ..., (A_{n−1} → A_n), (A_n → XV),
(XV → YW) in TE′ where the dependency (XV → YW) is a consequent of the rule FD2
applied to some W ⊆ V and (X → Y), as shown below.

     t_x       t_y              t_j                  t_n         W ⊆ V    X → Y
    X → Y    Y → A_1    ...   A_j → A_{j+1}   ...   A_n → XV        XV → YW
                                   ...
                                  E → F

Let the subsequence (Y → A_1), (A_1 → A_2), ..., (A_{n−1} → A_n), (A_n → XV), (XV → YW) in TE′
be denoted by SUB. Also, let t_1, ..., t_n be the subproofs of t′ that have these dependencies
as conclusions. Then (A_1 → A_2), ..., (A_{n−1} → A_n), (A_n → XV) is a chain, and thus by Lemma
13, there is a proof t_x of (A_1 → XV) from t_1, ..., t_n. By Lemma 12 (1), there is a proof, say
t_s, of (Y → YW) from (Y → A_1), (A_1 → XV) and W ⊆ V, that uses t_x. Notice that the
proof t_x does not use (X → Y) as an assumption. Now suppose TE′ = TE_1, SUB, TE_2. Then
TE″ = TE_1, (Y → YW), TE_2 is a chain of dependencies. Therefore, by Lemma 13 there is a proof,
say t_final, of (E → F) from dependencies in TE″, using proof fragments of t′ that have dependencies
of TE_1, TE_2 as conclusions and t_x. Notice that since t_x does not use (X → Y) as an assumption,
t_final uses one less instance of (X → Y) as an assumption than t′. The structure is shown below.

    TE′ = TE_1, (Y → A_1), (A_1 → A_2), ..., (A_n → XV), (XV → YW), TE_2
                \_______________________ SUB _______________________/

Case 2: Suppose (X → Y) occurs successively in T′, once as an antecedent to FD3 (and thus is
included in the transitive envelope) and again as an antecedent to FD2 in that order. Then there is
some subsequence (XV → YW), (YW → A_1), (A_1 → A_2), ..., (A_{n−1} → A_n), (A_n → X), (X → Y),
say SUB, in TE′, where the dependency (XV → YW) is a consequence of applying FD2 to some
W ⊆ V and (X → Y), as shown below.

    W ⊆ V    X → Y
    ---------------
       XV → YW       YW → A_1     A_1 → A_2    ...    A_n → X     X → Y
                                     ...
                                    E → F

Then (XV → X) is derivable by FD1, because X ⊆ XV. Suppose TE′ = TE_1, SUB, TE_2.
Then TE_1, (XV → X), (X → Y), TE_2 is a chain. The proof fragments that prove dependencies
in TE_1, TE_2 are subproofs in t′. By Lemma 13, we get a proof, say t_final, of (E → F) that has one
less occurrence of (X → Y) as an assumption, because the proof fragments that had conclusions in
SUB used (X → Y) as an assumption, which is not there in t_final.

Case 3: Suppose (X → Y) occurs successively in T′, both times as an antecedent to FD2. Then there is a
subsequence (XV → YW), (YW → A_1), (A_1 → A_2), ..., (A_{n−1} → A_n), (A_n → XA), (XA → YB)
in TE′, say SUB, where (XV → YW) is a consequence of applying FD2 to some W ⊆ V and
(X → Y), and (XA → YB) is a consequence of applying FD2 to some B ⊆ A and (X → Y), as
follows.

    W ⊆ V    X → Y        t_1          t_2                 t_n        B ⊆ A    X → Y
    ---------------                                                   ---------------
       XV → YW         YW → A_1     A_1 → A_2     ...    A_n → XA        XA → YB
                                        ...
                                       E → F

Then, by Lemma 13, there is a proof t_r of (YW → XA) from (YW → A_1), (A_1 → A_2), ..., (A_{n−1} → A_n),
(A_n → XA) that uses the same proof fragments.

Then by Lemma 12 (2), there is a proof, say t_T, of (YV → YB) from (YW → XA), W ⊆ V and
B ⊆ A. Hence the following, say t_V, is a proof of (XV → YB) using (X → Y) as an assumption
only once.

    V ⊆ V    X → Y          t_T
    ---------------
       XV → YV           YV → YB
       --------------------------
               XV → YB

TE″ = TE_1, (XV → YB), TE_2 is a chain of dependencies. Hence by Lemma 13, there is a
proof, say t_final, of (E → F) from the subproofs of t′ that have dependencies in TE_1, TE_2, and t_V
as assumptions. Notice that t_final has one less occurrence of (X → Y) than t′ because t_V used it
as an assumption only once. The situation is as follows.

    TE′ = TE_1, (XV → YW), (YW → A_1), (A_1 → A_2), ..., (A_n → XA), (XA → YB), TE_2
                \_____________________________ SUB _____________________________/

    TE″ = TE_1, (XV → YB), TE_2
                (proved by t_V)

Proof  of  the  Theorem  on  Properties  of  Inverse  Proofs  (Theorem  6): 

Case 1: t proves the functional dependency (X → Y)

The proof is by induction. For the base case where t is (X → Y), t⁻¹ is (X ↛ Y). In case t is
a single application of FD2 or FD3, the result follows from the proof of Lemma 14.

In case t consists of more than one application of a rule, let the proof fragment which corresponds
to the application of the first proof rule be f, and its conclusion be (A → B), and, say
(P → Q) is at the top of γ. Then (A → B) is the head of γ′ and t′ proves (X → Y). By the
inductive hypotheses, t′⁻¹(γ′) proves (A ↛ B) where (X ↛ Y) is at the top of its independency
thread. By Lemma 14 the inverse of f, say f⁻¹, has (A ↛ B) as the head of the independency
thread and (P ↛ Q) as the consequent. Hence the consequent of f⁻¹ can be fused to the head of the
independency thread of t′⁻¹(γ′), to derive the desired result t⁻¹(γ).

Case 2: t proves the functional independency (X ↛ Y)

Once again, the proof is by induction. For the base case where t is (X ↛ Y), t⁻¹ is (X → Y).
In the case where t is a single application of FI1, FI2 or FI3, the result follows from
the proof of Lemma 14.

For the inductive case, the proof is similar to Case 1, except that we use the unique independency
thread. ■

Proof  of  the  Inconsistency  Test  (Lemma  15): 

If Σ ⊢ (X → Y) for some independency (X ↛ Y) ∈ Σ′, then by Definition 16, Σ ∪ Σ′ is
inconsistent.

To prove the converse, suppose Σ ∪ Σ′ is inconsistent. Then by Definition 16, there are attribute
sets P and Q such that Σ ∪ Σ′ ⊢ (P ↛ Q), (P → Q). Then, by the normal form theorem, there is
a normal proof t of (P ↛ Q) from Σ ∪ Σ′. In the following case analysis we show that this always
leads to the desired result.

Case 1: Suppose all rules FI3, FI1, and FI2 are applied in the independency thread of t.

Then the proof is of the following form.

                              Σ
                   Σ       AQ → Y     X ↛ Y
                           ------------------ FI3
        Σ        X → A         X ↛ AQ
                 ---------------------- FI1
      X → P           X ↛ Q
      ----------------------- FI2
             P ↛ Q

Hence we get Σ ⊢ (X → P), (P → Q), (X → A), (AQ → Y). Consequently, we get
Σ ⊢ (X → Y) from Armstrong’s axioms, where (X ↛ Y) ∈ Σ′.

Case 2: Suppose the application of independency rules is restricted to only FI3 and FI1.

Then only the first two proof rules are relevant. Therefore we get that P is X, and therefore
Σ ⊢ (AQ → Y), (X → A), (X → Q). Consequently, we get Σ ∪ Σ′ ⊢ (X → Y), for (X ↛ Y) ∈ Σ′.

Case 3: Suppose the independency rules applied are FI3 and FI2.

Then the proof is of the following form.

                     Σ
        Σ         Q → Y     X ↛ Y
                  ----------------- FI3
      X → P           X ↛ Q
      ----------------------- FI2
             P ↛ Q

Hence we get Σ ⊢ (X → P), (P → Q), (Q → Y). Consequently, we get Σ ⊢ (X → Y) from
Armstrong’s axioms, for (X ↛ Y) ∈ Σ′.

Case 4: Suppose the independency rules applied are FI1 and FI2.

Then the proof of (P ↛ Q) is of the form given below. In the proof we see that Y must be of the
form TQ for some attribute set T.

                     Σ
        Σ         X → T     X ↛ TQ
                  ------------------ FI1
      X → P           X ↛ Q
      ----------------------- FI2
             P ↛ Q

Hence we get Σ ⊢ (X → P), (P → Q), (X → T). Consequently, we get Σ ⊢ (X → TQ) from
Armstrong’s axioms, for (X ↛ TQ) ∈ Σ′.

Case 5: Suppose the only independency rule applied is FI3.

Then only the first application of the proof rule in case 1 is relevant and we get that $P$ is $X$, $AQ$ is $Y$, and $\Sigma \vdash (QU \to Y), (X \to Q)$. Consequently, by applying Armstrong’s axioms we get $\Sigma \cup \Sigma' \vdash (X \to Y)$, for $(X \not\to Y) \in \Sigma'$.

Case 6: Suppose the only independency rule applied is FI1.

Then only the first two lines of the proof of case 4 are relevant. Then $P$ is $X$ and $Y$ is $TQ$. Hence we get that $\Sigma \vdash (P \to T), (P \to Q)$, and by Armstrong’s axioms we get $\Sigma \vdash (P \to TQ)$, i.e. $\Sigma \vdash (X \to Y)$, for $(X \not\to Y) \in \Sigma'$.

Case 7: Suppose the only independency rule applied is FI2.

Then the proof of $(P \not\to Q)$ is of the following form.

\[
\dfrac{X \to P \qquad X \not\to Q}{P \not\to Q}\,\text{FI2}
\]

Then, we get that $Y$ is $Q$ and that $\Sigma \vdash (X \to P), (P \to Q)$, and by Armstrong’s axioms we get $\Sigma \vdash (X \to Y)$. ■
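Lemma 15 reduces the consistency test to ordinary dependency implication, which can be decided with the standard attribute-closure computation. The following is a minimal sketch under that reading, not an implementation taken from the paper. The names `closure`, `implies`, `is_consistent`, and `attrs` are hypothetical, and attribute sets are modelled as sets of single-character attribute names.

```python
from typing import FrozenSet, Iterable, List, Tuple

Attrs = FrozenSet[str]
# A pair (X, Y) stands for X -> Y in a dependency list,
# and for the independency "X does not determine Y" in an independency list.
Dep = Tuple[Attrs, Attrs]


def closure(start: Iterable[str], fds: List[Dep]) -> Attrs:
    """Attribute closure of `start` under `fds` (fixpoint iteration)."""
    result = set(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def implies(fds: List[Dep], x: Attrs, y: Attrs) -> bool:
    """True when X -> Y follows from fds, i.e. Y lies in the closure of X."""
    return y <= closure(x, fds)


def is_consistent(fds: List[Dep], indeps: List[Dep]) -> bool:
    """Lemma 15 read as a test: no listed independency is refuted by fds."""
    return not any(implies(fds, x, y) for x, y in indeps)


def attrs(s: str) -> Attrs:
    """Hypothetical helper: 'AB' becomes frozenset({'A', 'B'})."""
    return frozenset(s)


# Illustrative data (assumed, not from the paper): A -> B and B -> C.
sigma = [(attrs("A"), attrs("B")), (attrs("B"), attrs("C"))]
compatible = is_consistent(sigma, [(attrs("C"), attrs("A"))])
clashing = is_consistent(sigma, [(attrs("A"), attrs("C"))])
print(compatible, clashing)
```

Here the independency of A from C clashes with A -> C, which follows from the two dependencies by transitivity, while the independency of C from A does not.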

Proof of Consistency when adding a dependency (Lemma 16):

Suppose $\Sigma \cup \Sigma' \cup \{X \to Y\}$ is not consistent. Then by Lemma 15, there is an independency $(P \not\to Q) \in \Sigma'$ such that $\Sigma \cup \{X \to Y\} \vdash (P \to Q)$. Because $\Sigma \cup \Sigma'$ is consistent, the proof of $(P \to Q)$ must use $(X \to Y)$ as an assumption. Then, by Theorem 5, there is a proof of $(P \to Q)$ from assumptions $\Sigma \cup \{X \to Y\}$ that uses $(X \to Y)$ only once as an assumption. Call this proof $t$. Let $\gamma$ be the proof thread of $t$ that begins at the assumption $(X \to Y)$. Then, by Theorem 6, $f^{-1}(\gamma)$ is a proof of $(X \not\to Y)$ from the assumptions $\Sigma \cup \{P \not\to Q\}$. Because $(P \not\to Q) \in \Sigma'$, $\Sigma \cup \Sigma' \vdash (X \not\to Y)$, contradicting the hypotheses of the Lemma. ■
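In practice, whether a new dependency can be added safely may be checked directly through Lemma 15, without deriving independencies: extend the dependency set and confirm that no stored independency becomes refuted. The sketch below assumes that approach. `can_add_dependency` and `S` are hypothetical names, and the example data is illustrative only.

```python
from typing import FrozenSet, Iterable, List, Tuple

Attrs = FrozenSet[str]
Dep = Tuple[Attrs, Attrs]  # (X, Y): X -> Y, or an independency of Y from X


def closure(start: Iterable[str], fds: List[Dep]) -> Attrs:
    """Attribute closure of `start` under `fds` (fixpoint iteration)."""
    result = set(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def can_add_dependency(fds: List[Dep], indeps: List[Dep], new_fd: Dep) -> bool:
    """True when fds plus new_fd still refutes none of the independencies."""
    extended = fds + [new_fd]
    return all(not (q <= closure(p, extended)) for p, q in indeps)


def S(s: str) -> Attrs:
    """Hypothetical helper: 'AB' becomes frozenset({'A', 'B'})."""
    return frozenset(s)


sigma = [(S("A"), S("B"))]        # A -> B
sigma_prime = [(S("A"), S("C"))]  # A does not determine C
ok = can_add_dependency(sigma, sigma_prime, (S("C"), S("B")))
clash = can_add_dependency(sigma, sigma_prime, (S("B"), S("C")))
print(ok, clash)
```

The second call is rejected because A -> B together with B -> C would give A -> C. This agrees with Lemma 16: rule FI2 applied to A -> B and the independency of C from A already yields the independency of C from B, so the hypothesis of the lemma fails for B -> C.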

Proof of Consistency when adding an independency (Lemma 17):

Suppose $\Sigma \cup \Sigma' \nvdash (X \to Y)$ and $\Sigma \cup \Sigma' \cup \{X \not\to Y\}$ is inconsistent. Then there is a dependency $(P \to Q)$ such that $\Sigma \vdash (P \to Q)$ with $\Sigma \cup \Sigma' \cup \{X \not\to Y\} \vdash (P \not\to Q)$. By the normal form theorem (i.e. Theorem 1), there is a normal proof $t$ of $(P \not\to Q)$ from $\Sigma \cup \Sigma' \cup \{X \not\to Y\}$. Notice that a normal proof applies independence rules in the order FI3, FI1, FI2.
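To make the order FI3, FI1, FI2 concrete, the three rules can be written as small functions and chained. This sketch reads the rule shapes off the proof trees in this appendix; the authoritative definitions are those given earlier in the paper. The function names are hypothetical, and `fi1` picks one particular instance of the rule by taking Q to be the part of AQ outside A.

```python
from typing import FrozenSet, Optional, Tuple

Attrs = FrozenSet[str]
Pair = Tuple[Attrs, Attrs]  # (left side, right side)


def fi3(fd: Pair, indep: Pair) -> Optional[Pair]:
    """From Z -> Y and 'X does not determine Y' infer 'X does not determine Z'."""
    (z, y), (x, y2) = fd, indep
    return (x, z) if y == y2 else None


def fi1(fd: Pair, indep: Pair) -> Optional[Pair]:
    """From X -> A and 'X does not determine AQ' infer 'X does not determine Q'.

    Assumption of this sketch: Q is taken to be AQ minus A.
    """
    (x, a), (x2, aq) = fd, indep
    return (x, aq - a) if x == x2 and a <= aq else None


def fi2(fd: Pair, indep: Pair) -> Optional[Pair]:
    """From X -> P and 'X does not determine Q' infer 'P does not determine Q'."""
    (x, p), (x2, q) = fd, indep
    return (p, q) if x == x2 else None


X, Y, A, Q, P = (frozenset(c) for c in "XYAQP")
AQ = A | Q

# Independency thread of Case 1: FI3, then FI1, then FI2.
step1 = fi3((AQ, Y), (X, Y))   # X does not determine AQ
step2 = fi1((X, A), step1)     # X does not determine Q
step3 = fi2((X, P), step2)     # P does not determine Q
print(step3 == (P, Q))
```

Each function returns `None` when its premises do not match, so a chain that breaks off early corresponds to one of the shorter cases.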

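Case 1: Suppose all rules FI3, FI2, FI1 are applied in the independency thread of $t$.

Then the proof is of the following form.

\[
\dfrac{\Sigma \vdash (X \to P) \qquad \dfrac{\Sigma \vdash (X \to A) \qquad \dfrac{\Sigma \vdash (AQ \to Y) \qquad X \not\to Y}{X \not\to AQ}\,\text{FI3}}{X \not\to Q}\,\text{FI1}}{P \not\to Q}\,\text{FI2}
\]

Hence we get $\Sigma \vdash (X \to P), (P \to Q), (X \to A), (AQ \to Y)$. Consequently, we get $\Sigma \vdash (X \to Y)$ from Armstrong’s axioms, contradicting $\Sigma \cup \Sigma' \nvdash (X \to Y)$.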


Case 2: Suppose the independency rules applied are FI3 and FI1, in that order.

Then only the first three lines of the above proof are relevant, and hence we get that $P$ is $X$, and therefore $\Sigma \vdash (QA \to Y), (X \to A), (X \to Q)$. Consequently, we get $\Sigma \cup \Sigma' \vdash (X \to Y)$ for a contradiction.

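Case 3: Suppose the independency rules applied are FI3 and FI2, in that order.

Then the proof is as follows.

\[
\dfrac{\Sigma \vdash (X \to P) \qquad \dfrac{\Sigma \vdash (Q \to Y) \qquad X \not\to Y}{X \not\to Q}\,\text{FI3}}{P \not\to Q}\,\text{FI2}
\]

Hence we get $\Sigma \vdash (X \to P), (P \to Q), (Q \to Y)$. Consequently, we get $\Sigma \vdash (X \to Y)$ from Armstrong’s axioms, contradicting $\Sigma \cup \Sigma' \nvdash (X \to Y)$.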

Case 4: Suppose the independency rules applied are FI1 and FI2, in that order.

Then the proof of $(P \not\to Q)$ is of the form given below. In the proof we see that $Y$ must be of the form $TQ$ for some attribute set $T$.

\[
\dfrac{\Sigma \vdash (X \to P) \qquad \dfrac{\Sigma \vdash (X \to T) \qquad X \not\to TQ}{X \not\to Q}\,\text{FI1}}{P \not\to Q}\,\text{FI2}
\]

Hence we get $\Sigma \vdash (X \to P), (P \to Q), (X \to T)$. Consequently, we get $\Sigma \vdash (X \to TQ)$ from Armstrong’s axioms, contradicting $\Sigma \cup \Sigma' \nvdash (X \to Y)$.

Case 5: Suppose the only independency rule applied is FI3.

Then only the first two lines of the proof in case 1 are relevant and we get that $P$ is $X$ and $\Sigma \vdash (QA \to Y), (X \to Q)$. Consequently, by applying Armstrong’s axioms we get $\Sigma \cup \Sigma' \vdash (X \to Y)$ for a contradiction.

Case 6: Suppose the only independency rule applied is FI1.

Then only the first two lines of the proof of case 4 are relevant. Then $P$ is $X$ and $Y$ is $TQ$. Hence we get that $\Sigma \vdash (P \to T), (P \to Q)$, and by Armstrong’s axioms we get $\Sigma \vdash (P \to TQ)$, i.e. $\Sigma \vdash (X \to Y)$ for a contradiction.

Case 7: Suppose the only independency rule applied is FI2.

Then the proof of $(P \not\to Q)$ is of the following form.

\[
\dfrac{X \to P \qquad X \not\to Q}{P \not\to Q}\,\text{FI2}
\]

Then, we get that $Y$ is $Q$ and that $\Sigma \vdash (X \to P), (P \to Q)$, and by Armstrong’s axioms we get $\Sigma \vdash (X \to Y)$ for a contradiction. ■
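Read operationally, Lemma 17 says that an independency may be added whenever the corresponding dependency is not derivable. Assuming the existing constraint set is already consistent, and that derivability of a dependency is decided from the dependencies alone (as in every case of the proof above), this becomes a single closure computation. `can_add_independency` and `S` are hypothetical names in this sketch.

```python
from typing import FrozenSet, Iterable, List, Tuple

Attrs = FrozenSet[str]
Dep = Tuple[Attrs, Attrs]  # (X, Y) read as X -> Y


def closure(start: Iterable[str], fds: List[Dep]) -> Attrs:
    """Attribute closure of `start` under `fds` (fixpoint iteration)."""
    result = set(start)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def can_add_independency(fds: List[Dep], x: Attrs, y: Attrs) -> bool:
    """True when X -> Y is not derivable from fds, so the independency is safe.

    Assumes the current dependencies and independencies are consistent.
    """
    return not (y <= closure(x, fds))


def S(s: str) -> Attrs:
    """Hypothetical helper: 'AB' becomes frozenset({'A', 'B'})."""
    return frozenset(s)


sigma = [(S("A"), S("B")), (S("B"), S("C"))]  # A -> B, B -> C
rejected = can_add_independency(sigma, S("A"), S("C"))
accepted = can_add_independency(sigma, S("C"), S("A"))
print(rejected, accepted)
```

Unlike the check for a new dependency, this test does not need to revisit the stored independencies, because adding an independency leaves the set of derivable dependencies unchanged.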